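Building an AI Assistant on Cloud Run with Gemini and Streamlit

Streamlit was originally designed as a framework that lets data scientists and developers build dashboards and apps quickly in Python, with the selling point that no front-end experience is needed. Alongside frameworks such as Gradio and OpenUI, it has become a common starting point over the past few years for developers building web apps on top of large language models.

I covered a Cloud Run + Gradio setup in an earlier post. This time the example uses GCP Cloud Run together with Vertex AI to build an AI assistant.

The whole demo project is based on Codespaces; I'm getting lazier and lazier about touching my own dev machine XD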

gcp-ai-assistant/  
│  
├── main.py             # Main application code (revised)  
├── Dockerfile          # Builds the container image  
├── requirements.txt    # Python dependencies  
└── .streamlit/  
    └── secrets.toml    # Holds the GCP project ID and region

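Here is roughly what main.py does:

Vertex AI initialization: vertexai.init() performs the initialization. The required project and location (region) are read from Streamlit secrets first (convenient for local development), and from environment variables if that fails.
Model calls: GenerativeModel is used to talk to the Gemini API. Its start_chat method handles multi-turn conversations.
Streaming: calling the Vertex AI SDK's send_message with stream=True returns an iterator of response chunks. main.py wraps it in a generator that yields each chunk's text, and that generator is handed to Streamlit's st.write_stream, which renders the content to the front end as it arrives.
Conversation history: the history stored in Streamlit's session_state is converted into the format the Vertex AI SDK expects.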
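
The streaming step can be sketched in isolation, without the SDK. The block below is a minimal illustration, not the SDK's own behavior: `FakeChunk` and `text_chunks` are hypothetical names, and the idea that reading `.text` on a chunk with no text raises `ValueError` is an assumption that should be checked against the SDK version in use.

```python
from typing import Iterable, Iterator


class FakeChunk:
    """Hypothetical stand-in for one streamed response chunk."""

    def __init__(self, text=None):
        self._text = text

    @property
    def text(self):
        # Assumption: a chunk without text raises ValueError when .text is read.
        if self._text is None:
            raise ValueError("chunk has no text")
        return self._text


def text_chunks(chunks: Iterable) -> Iterator[str]:
    """Yield the text of each chunk, skipping chunks that carry no text."""
    for chunk in chunks:
        try:
            piece = chunk.text
        except ValueError:
            continue
        if piece:
            yield piece


# A generator like this is the kind of object st.write_stream can consume.
stream = text_chunks([FakeChunk("Hello"), FakeChunk(None), FakeChunk(", world")])
full_response = "".join(stream)
```

Joining the pieces by hand here mirrors what the UI code later relies on: the concatenated text is what gets stored back into the conversation history.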

import os  
import textwrap  
import streamlit as st  
import vertexai  
from vertexai.generative_models import GenerativeModel, Part, Content  

# --- Configuration ---  
# On Cloud Run, GCP_PROJECT and GCP_REGION are provided as environment variables  
# Locally, they can be read from secrets.toml  
try:  
    GCP_PROJECT = st.secrets["gcp"]["project_id"]  
    GCP_REGION = st.secrets["gcp"]["region"]  
except (FileNotFoundError, KeyError):  
    GCP_PROJECT = os.environ.get("GCP_PROJECT")  
    GCP_REGION = os.environ.get("GCP_REGION")  

MODEL_NAME = "gemini-2.5-flash"  
GCP_REGION = "us-central1"  # Deliberately overrides the region read above: model availability differs by region, so us-central1 is kept for now  
HISTORY_LENGTH = 5  

st.set_page_config(page_title="GCP AI Assistant", page_icon="")  

# --- Initialize Vertex AI ---  
@st.cache_resource  
def init_vertexai():  
    """Initialize the Vertex AI SDK."""  
    if not GCP_PROJECT or not GCP_REGION:  
        st.error("GCP project ID or region is not set. Set them in .streamlit/secrets.toml or as environment variables.")  
        st.stop()  
    vertexai.init(project=GCP_PROJECT, location=GCP_REGION)  

@st.cache_resource  
def get_model():  
    """Return the generative model."""  
    return GenerativeModel(MODEL_NAME)  

init_vertexai()  
model = get_model()  

# --- Core logic ---  

INSTRUCTIONS = textwrap.dedent("""  
    - You are a helpful AI chat assistant.  
    - Your main focus is on answering questions about Google Cloud, Vertex AI, and general Python.  
    - Use markdown for formatting, like headers (##), code blocks, and lists.  
    - Provide clear, concise answers and include code examples when helpful.  
    - Assume the user is a beginner.  
    - Be friendly and encouraging.  
""")  

def get_response_stream(user_prompt):  
    """  
    Get a streaming response from Vertex AI.  
    Combines the system instructions with the conversation history.  
    """  
    history = []  
    messages = st.session_state.get("messages", [])  

    # Exclude the last message (the current user_prompt), since it is sent as its own message  
    prior_messages = messages[:-1]  

    # Limit only the length of the past conversation  
    conversation_history = prior_messages[-HISTORY_LENGTH:]  

    for msg in conversation_history:  
        role = "user" if msg["role"] == "user" else "model"  
        # Wrap each Part in a Content object, as the start_chat API expects  
        history.append(Content(  
            role=role,   
            parts=[Part.from_text(msg["content"])]  
        ))  

    # Create the chat session from the list of Content objects  
    chat = model.start_chat(history=history)  

    # Build the full prompt (system instructions plus the user's question)  
    full_prompt = f"{INSTRUCTIONS}\n\nUser Question: {user_prompt}"  

    # Send the message and yield the text of each streamed chunk  
    response_generator = chat.send_message(full_prompt, stream=True)  
    for chunk in response_generator:  
        yield chunk.text  

# --- UI rendering ---  

st.title("JH5's AI Assistant")  
st.caption("Powered by Google Cloud Run and Vertex AI")  

# Initialize the conversation history  
if "messages" not in st.session_state:  
    st.session_state.messages = []  

# Display past messages  
for message in st.session_state.messages:  
    with st.chat_message(message["role"]):  
        st.markdown(message["content"])  

# Receive user input  
if prompt := st.chat_input("Ask me about Anything..."):  
    # Add the user message to the history and display it  
    st.session_state.messages.append({"role": "user", "content": prompt})  
    with st.chat_message("user"):  
        st.markdown(prompt)  

    # Display the AI response  
    with st.chat_message("assistant"):  
        try:  
            with st.spinner("Thinking..."):  
                response_stream = get_response_stream(prompt)  
                # Use st.write_stream to handle the streamed output  
                full_response = st.write_stream(response_stream)  

            # Add the complete AI response to the history  
            st.session_state.messages.append({"role": "assistant", "content": full_response})  

        except Exception as e:  
            st.error(f"An error occurred: {e}")  

# Add a button that clears the conversation  
if len(st.session_state.messages) > 0:  
    st.button("Clear Conversation", on_click=lambda: st.session_state.clear())
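
One detail of the history window is worth a closer look: with an odd HISTORY_LENGTH such as 5, the window can open with a model turn once the conversation is long enough. Whether the API accepts a history that starts with a model turn is not verified here, so the sketch below simply shows one way to make the window always open with a user turn. `build_history_window` is a hypothetical helper that works on plain dicts and returns (role, text) pairs rather than SDK Content objects.

```python
from typing import Dict, List, Tuple

HISTORY_LENGTH = 5  # same value as in main.py


def build_history_window(
    messages: List[Dict[str, str]], limit: int = HISTORY_LENGTH
) -> List[Tuple[str, str]]:
    """Return (role, text) pairs for the turns that precede the newest message."""
    prior = messages[:-1]  # the newest message is sent separately
    window = prior[-limit:] if limit > 0 else []
    # Map Streamlit's "assistant" role to the "model" role used for history.
    turns = [
        ("user" if m["role"] == "user" else "model", m["content"]) for m in window
    ]
    # Drop leading model turns so the window opens with a user turn.
    while turns and turns[0][0] != "user":
        turns.pop(0)
    return turns


# Three completed exchanges plus a new question (seven messages in total).
messages = []
for i in range(1, 4):
    messages.append({"role": "user", "content": f"q{i}"})
    messages.append({"role": "assistant", "content": f"a{i}"})
messages.append({"role": "user", "content": "q4"})

window = build_history_window(messages)
```

In this example the raw five-message window would begin with the first answer; after trimming, it begins with the second question instead.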

Dependencies (`requirements.txt`)

This file lists all the Python packages the application needs.

# requirements.txt  
streamlit>=1.34.0  
google-cloud-aiplatform>=1.50.0

Dockerfile

Cloud Run also supports deploying directly from source, but it is still a good habit these days to build a container image whenever you can. Cloud Run will use this image to run the demo.

# Use the official Python base image  
FROM python:3.11-slim  

# Set the working directory  
WORKDIR /app  

# Copy the dependency list and install the packages  
COPY requirements.txt ./requirements.txt  
RUN pip install --no-cache-dir -r requirements.txt  

# Copy all application files into the container  
COPY . .  

# Command to run when the container starts (shell form, so that the $PORT environment variable is expanded)  
# Cloud Run sets $PORT automatically (8080 by default)  
CMD streamlit run main.py --server.port $PORT --server.address 0.0.0.0 --server.enableCORS=false --server.runOnSave=false

Local testing configuration (.streamlit/secrets.toml)

Remember that this file should not be committed to your git repository.

# .streamlit/secrets.toml  
[gcp]  
project_id = "your-gcp-project-id"  # Replace with your GCP project ID  
region = "asia-east1"               # That's Taiwan
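
The lookup order that main.py applies to these values (secrets first, environment variables second) can be sketched as a small pure function. `resolve_gcp_config` is a hypothetical helper written for illustration; it takes plain mappings so it can be tried without Streamlit, and the sample project names are made up.

```python
from typing import Mapping, Optional, Tuple


def resolve_gcp_config(
    secrets: Mapping, environ: Mapping[str, str]
) -> Tuple[Optional[str], Optional[str]]:
    """Return (project, region), preferring secrets over environment variables."""
    try:
        gcp = secrets["gcp"]
        return gcp["project_id"], gcp["region"]
    except (KeyError, FileNotFoundError):
        # No usable [gcp] table: fall back to the environment for both values.
        return environ.get("GCP_PROJECT"), environ.get("GCP_REGION")


local = resolve_gcp_config(
    {"gcp": {"project_id": "demo-project", "region": "asia-east1"}}, {}
)
cloud = resolve_gcp_config(
    {}, {"GCP_PROJECT": "prod-project", "GCP_REGION": "asia-east1"}
)
missing = resolve_gcp_config({}, {})
```

When neither source provides the values, both come back as None, which is the case main.py reports with st.error before stopping.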

Installing the gcloud CLI

Codespaces does not appear to ship with the gcloud CLI by default, so here are the short installation steps from the official docs:

# To download the Linux archive file, run the following command:  
curl -O https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-cli-linux-x86_64.tar.gz  

# To extract the contents of the file to your file system (preferably to your home directory), run the following command:  
tar -xf google-cloud-cli-linux-x86_64.tar.gz  

# Install & Init  
./google-cloud-sdk/install.sh  
./google-cloud-sdk/bin/gcloud init

Deploying to GCP Cloud Run

This assumes you have already installed the gcloud CLI and completed authentication.

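1. Enable the required GCP APIs

In your GCP project, enable the APIs for Cloud Run, Artifact Registry (used to store the Docker image), and Vertex AI. Cloud Build is included as well, since step 3 builds the image with gcloud builds submit.

gcloud services enable run.googleapis.com \
    artifactregistry.googleapis.com \
    cloudbuild.googleapis.com \
    aiplatform.googleapis.com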

2. Create an Artifact Registry repository

We need somewhere to store the Docker image.

gcloud artifacts repositories create my-streamlit-apps \
    --repository-format=docker \
    --location=asia-east1

3. Build and push the Docker image

From the project root (gcp-ai-assistant/), run the following commands.

# Get your GCP project ID  
export PROJECT_ID=$(gcloud config get-value project)  

# Define the image name  
export IMAGE_NAME="asia-east1-docker.pkg.dev/${PROJECT_ID}/my-streamlit-apps/ai-assistant:latest"  

# Build and push the Docker image with Cloud Build (recommended; no local Docker install needed)  
gcloud builds submit --tag ${IMAGE_NAME}  

# If you are not running in Codespaces, you can also build with Docker in your own dev environment  
# gcloud auth configure-docker asia-east1-docker.pkg.dev  
# docker build -t ${IMAGE_NAME} .  
# docker push ${IMAGE_NAME}

This takes roughly 3 to 5 minutes.
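
The image name used above follows the Artifact Registry pattern of location, project, repository, image, and tag. As a small illustration, the sketch below assembles the same string in Python; `artifact_registry_image` is a hypothetical helper and "my-project" is a placeholder project ID.

```python
def artifact_registry_image(
    project_id: str,
    location: str,
    repository: str,
    image: str,
    tag: str = "latest",
) -> str:
    """Compose a Docker image path for an Artifact Registry repository."""
    parts = {
        "project_id": project_id,
        "location": location,
        "repository": repository,
        "image": image,
        "tag": tag,
    }
    for name, value in parts.items():
        if not value:
            raise ValueError(f"{name} must not be empty")
    return f"{location}-docker.pkg.dev/{project_id}/{repository}/{image}:{tag}"


image_name = artifact_registry_image(
    project_id="my-project",  # placeholder, not a real project
    location="asia-east1",
    repository="my-streamlit-apps",
    image="ai-assistant",
)
```

Failing early on an empty component helps catch the case where `gcloud config get-value project` returns nothing and the path would otherwise contain a double slash.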

4. Deploy to Cloud Run

Finally, run the deploy command.

# Get the image name  
export PROJECT_ID=$(gcloud config get-value project)  
export IMAGE_NAME="asia-east1-docker.pkg.dev/${PROJECT_ID}/my-streamlit-apps/ai-assistant:latest"  

gcloud run deploy streamlit-ai-assistant \
    --image=${IMAGE_NAME} \
    --platform=managed \
    --region=asia-east1 \
    --allow-unauthenticated \
    --set-env-vars="GCP_PROJECT=${PROJECT_ID},GCP_REGION=asia-east1"

Once deployment finishes, gcloud prints a URL. Open it and you will see the AI Assistant running on GCP (as in the cover image at the top).
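
For a quick check from a script instead of a browser, a small probe of the deployed URL can help. The sketch below assumes that the Streamlit version in use serves a health endpoint at `/_stcore/health`; that path is an assumption to verify for your version, and `health_url` and `is_healthy` are hypothetical helper names.

```python
import urllib.request


def health_url(service_url: str) -> str:
    """Build the health-check URL for a deployed Streamlit service."""
    # Assumption: the app exposes /_stcore/health (verify for your version).
    return service_url.rstrip("/") + "/_stcore/health"


def is_healthy(service_url: str, timeout: float = 10.0) -> bool:
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url(service_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection errors, timeouts, and HTTP error responses.
        return False
```

Usage would be `is_healthy("<the URL printed by gcloud>")`, passing the service URL from the deploy output.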

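Note I: The models currently available on Vertex AI come with different version IDs depending on the region. When I first tried this I picked Taiwan, but several model IDs were not supported there QQ. If you need a different model, see Model versions and lifecycle.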
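
One way to avoid hardcoding the region in main.py is to keep the configured region when the model is offered there and fall back otherwise. The sketch below is only an illustration: `choose_vertex_region` is a hypothetical helper, and the availability set is a made-up placeholder, so the real list has to come from the Model versions and lifecycle page.

```python
from typing import Iterable, Optional


def choose_vertex_region(
    configured: Optional[str],
    supported: Iterable[str],
    fallback: str = "us-central1",
) -> str:
    """Use the configured region if the model is offered there, else the fallback."""
    if configured and configured in set(supported):
        return configured
    return fallback


# Placeholder availability data for illustration only, not taken from the docs.
HYPOTHETICAL_SUPPORTED = {"us-central1", "europe-west4"}

region = choose_vertex_region("asia-east1", HYPOTHETICAL_SUPPORTED)
```

With this approach the Cloud Run service can still be deployed in asia-east1 while the Vertex AI calls go to a region where the chosen model ID exists.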

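Note II: GCP Logs Explorer has improved compared with before, and log entries now have an Investigate button wired directly to the Preview version of Cloud Assist. Still, it feels like the brain behind it is an early gemini-1.5 version, so it is a bit dim, and you cannot send several adjacent log entries to Cloud Assist together...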