March 10, 2026
Ken Suzuki
Technology

How to Fetch Google Analytics 4 Data Automatically with Python

A step-by-step guide to calling the GA4 Data API using service account authentication and automating analytics report generation with Python. Covers everything from Google Cloud setup to running the data-fetch script.

Google Analytics 4, Python, GA4 API, Google Cloud, Automation

1. Introduction

Checking Google Analytics 4 (GA4) data in the browser every time is tedious. By calling the GA4 Data API from a Python script, you can automate regular report generation. This article walks through the entire process, from setting up a Google Cloud service account to fetching data with Python, based on a setup that actually works.


2. Overview

  1. Enable the API in Google Cloud
  2. Create a service account and download the credentials (JSON)
  3. Add the service account as a Viewer in your GA4 property
  4. Set up the Python environment and run the script

3. Google Cloud Configuration

3-1. Enable the Google Analytics Data API

  1. Go to the Google Cloud Console
  2. Select your project (or create a new one)
  3. Search for "Google Analytics Data API" in the top search bar
  4. Click "Enable"

The API may not take effect immediately after enabling. Wait 2–3 minutes before proceeding.

3-2. Create a Service Account

  1. IAM & Admin → Service Accounts → Create Service Account
  2. Name: ga-report-reader (any name works)
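  3. Role: Viewer (optional; access to GA4 data is granted on the property in step 3-3, not by this project-level role)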
  4. After creation, click the service account → Keys → Add Key → JSON
  5. Save the downloaded JSON file in a secure location

mkdir -p ~/.config
mv ~/Downloads/[downloaded-filename].json ~/.config/ga-credentials.json
chmod 600 ~/.config/ga-credentials.json  # readable only by your user
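
Before moving on, it can help to confirm that the saved file really is a service account key and to read off the email address needed in step 3-3. The following is a minimal sketch using only the standard library; the function name read_service_account_email is a hypothetical helper, not part of any Google library, and the path assumes the location used above.

```python
import json
from pathlib import Path


def read_service_account_email(key_path: Path) -> str:
    """Return the client_email stored in a service account key file.

    Raises ValueError if the file does not look like a service account key.
    """
    data = json.loads(key_path.read_text(encoding="utf-8"))
    # Service account key files carry a "type" field set to "service_account".
    if data.get("type") != "service_account":
        raise ValueError(f"{key_path} is not a service account key")
    email = data.get("client_email")
    if not email:
        raise ValueError(f"{key_path} has no client_email field")
    return email


if __name__ == "__main__":
    # Assumed location, matching the mv command above.
    key_file = Path.home() / ".config" / "ga-credentials.json"
    if key_file.exists():
        print(read_service_account_email(key_file))
    else:
        print(f"Key file not found: {key_file}")
```

The printed address is the one to enter when adding the service account to the GA4 property.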

3-3. Add the Service Account to Your GA4 Property

  1. Google Analytics → Admin → Property Access Management
  2. Click "+" → Add Users
  3. Enter the service account email (ga-report-reader@xxx.iam.gserviceaccount.com)
  4. Role: Viewer

3-4. Find Your Property ID

Go to Admin → Property Settings → Property ID (numbers only, e.g. 123456789) and note it down.
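
A common slip is pasting a measurement ID (the "G-..." value) or a full resource name instead of the numeric property ID. As a rough sketch, a small check like the one below catches that early and builds the "properties/<id>" string that the script passes to the API. The helper name property_resource_name is hypothetical.

```python
def property_resource_name(property_id: str) -> str:
    """Validate a GA4 property ID and return the "properties/<id>" form."""
    pid = property_id.strip()
    # The property ID is digits only; isascii() rules out non-ASCII digit characters.
    if not (pid.isascii() and pid.isdigit()):
        raise ValueError(
            f"Property ID must contain digits only, got {property_id!r}"
        )
    return f"properties/{pid}"


print(property_resource_name("123456789"))  # properties/123456789
```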


4. Python Environment Setup

mkdir ~/ga-reports && cd ~/ga-reports
python3 -m venv venv
source venv/bin/activate
pip install google-analytics-data pandas
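
To confirm the packages landed in the active virtual environment, a quick check along these lines may be useful. It is only a sketch: is_installed is a hypothetical helper built on the standard library's importlib, and the module names are the ones the install command above should provide.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the module can be located in the current environment."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is missing.
        return False


for name in ("google.analytics.data_v1beta", "pandas"):
    status = "ok" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either line reports "missing", check that the virtual environment is activated before running pip.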

5. Report Script (report.py)

Below is the script used in practice. It generates two CSV reports: page-level performance and traffic sources.

"""
GA4 Report Generator (covers the last DAYS days; default 30)

Usage:
  python report.py

Environment variables:
  GA_PROPERTY_ID                  : GA4 Property ID (numbers only)
  GOOGLE_APPLICATION_CREDENTIALS  : Path to service account JSON
"""

import os
import csv
from datetime import datetime
from pathlib import Path

from google.analytics.data_v1beta import BetaAnalyticsDataClient
from google.analytics.data_v1beta.types import (
    DateRange,
    Dimension,
    Metric,
    OrderBy,
    RunReportRequest,
)

# -----------------------------------------------
# Configuration
# -----------------------------------------------
PROPERTY_ID = os.environ.get("GA_PROPERTY_ID", "YOUR_PROPERTY_ID")
DAYS = 30  # Reporting period (days)
OUTPUT_DIR = Path(__file__).parent / "output"


# -----------------------------------------------
# GA4 Data Fetching
# -----------------------------------------------
def fetch_report(client, property_id: str, days: int) -> list[dict]:
    """Fetch page-level performance report."""
    request = RunReportRequest(
        property=f"properties/{property_id}",
        date_ranges=[DateRange(start_date=f"{days}daysAgo", end_date="today")],
        dimensions=[
            Dimension(name="pagePath"),
            Dimension(name="pageTitle"),
        ],
        metrics=[
            Metric(name="screenPageViews"),
            Metric(name="activeUsers"),
            # Session-level average in seconds; stored below under the key "avg_engagement_sec".
            Metric(name="averageSessionDuration"),
            Metric(name="bounceRate"),
        ],
        order_bys=[
            OrderBy(
                metric=OrderBy.MetricOrderBy(metric_name="screenPageViews"),
                desc=True,
            )
        ],
        limit=50,
    )
    response = client.run_report(request)

    rows = []
    for row in response.rows:
        rows.append(
            {
                "page_path": row.dimension_values[0].value,
                "page_title": row.dimension_values[1].value,
                "views": int(row.metric_values[0].value),
                "active_users": int(row.metric_values[1].value),
                "avg_engagement_sec": round(float(row.metric_values[2].value), 1),
                "bounce_rate": round(float(row.metric_values[3].value) * 100, 1),
            }
        )
    return rows


def fetch_traffic_sources(client, property_id: str, days: int) -> list[dict]:
    """Fetch traffic source report."""
    request = RunReportRequest(
        property=f"properties/{property_id}",
        date_ranges=[DateRange(start_date=f"{days}daysAgo", end_date="today")],
        dimensions=[
            Dimension(name="sessionDefaultChannelGroup"),
            Dimension(name="sessionSource"),
        ],
        metrics=[
            Metric(name="sessions"),
            Metric(name="activeUsers"),
            Metric(name="bounceRate"),
        ],
        order_bys=[
            OrderBy(
                metric=OrderBy.MetricOrderBy(metric_name="sessions"),
                desc=True,
            )
        ],
        limit=20,
    )
    response = client.run_report(request)

    rows = []
    for row in response.rows:
        rows.append(
            {
                "channel": row.dimension_values[0].value,
                "source": row.dimension_values[1].value,
                "sessions": int(row.metric_values[0].value),
                "active_users": int(row.metric_values[1].value),
                "bounce_rate": round(float(row.metric_values[2].value) * 100, 1),
            }
        )
    return rows


# -----------------------------------------------
# Output
# -----------------------------------------------
def save_csv(rows: list[dict], filepath: Path) -> None:
    if not rows:
        print(f"  No data: {filepath.name}")
        return
    with open(filepath, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)
    print(f"  Saved: {filepath}")


def print_summary(page_rows: list[dict], source_rows: list[dict]) -> None:
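    # Totals cover only the rows returned (top pages). Summing activeUsers
    # across pages counts a user once per page visited, so it overstates
    # the property-wide unique user count.
    total_views = sum(r["views"] for r in page_rows)
    total_users = sum(r["active_users"] for r in page_rows)

    print("\n" + "=" * 50)
    print(f"  Total Views (listed pages):    {total_views}")
    print(f"  Active Users (sum over pages): {total_users}")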
    print()

    print("  [Top 5 Pages]")
    for r in page_rows[:5]:
        print(
            f"  {r['views']:>3}views  {r['avg_engagement_sec']:>6.1f}s  "
            f"bounce {r['bounce_rate']:>5.1f}%  {r['page_path']}"
        )

    print()
    print("  [Top 5 Traffic Sources]")
    for r in source_rows[:5]:
        print(
            f"  {r['sessions']:>3}sessions  {r['channel']} / {r['source']}"
        )
    print("=" * 50 + "\n")


# -----------------------------------------------
# Main
# -----------------------------------------------
def main():
    if PROPERTY_ID == "YOUR_PROPERTY_ID":
        print("Error: Please set the GA_PROPERTY_ID environment variable")
        print("  e.g. export GA_PROPERTY_ID=123456789")
        return

    OUTPUT_DIR.mkdir(exist_ok=True)
    date_str = datetime.now().strftime("%Y-%m-%d")

    print(f"\nGenerating GA4 report... (last {DAYS} days)\n")

    client = BetaAnalyticsDataClient()

    print("  Fetching page performance...")
    page_rows = fetch_report(client, PROPERTY_ID, DAYS)
    save_csv(page_rows, OUTPUT_DIR / f"{date_str}_pages.csv")

    print("  Fetching traffic sources...")
    source_rows = fetch_traffic_sources(client, PROPERTY_ID, DAYS)
    save_csv(source_rows, OUTPUT_DIR / f"{date_str}_sources.csv")

    print_summary(page_rows, source_rows)
    print(f"Output directory: {OUTPUT_DIR}/")


if __name__ == "__main__":
    main()
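
Both fetch functions above map values by position, so the dict keys must be kept in sync with the order of dimensions and metrics by hand. One possible alternative, sketched below, reads the names from the response's dimension_headers and metric_headers instead. The helper rows_to_dicts is hypothetical, and the SimpleNamespace object is only a stand-in shaped like an API response so the example runs without credentials.

```python
from types import SimpleNamespace


def rows_to_dicts(response) -> list[dict]:
    """Convert a report response into dicts keyed by the API's own field names.

    Values are left as strings, which is how the API returns them.
    """
    dim_names = [h.name for h in response.dimension_headers]
    met_names = [h.name for h in response.metric_headers]
    result = []
    for row in response.rows:
        record = {n: v.value for n, v in zip(dim_names, row.dimension_values)}
        record.update({n: v.value for n, v in zip(met_names, row.metric_values)})
        result.append(record)
    return result


# Stand-in object for illustration only; a real response comes from run_report().
fake_response = SimpleNamespace(
    dimension_headers=[SimpleNamespace(name="pagePath")],
    metric_headers=[SimpleNamespace(name="screenPageViews")],
    rows=[
        SimpleNamespace(
            dimension_values=[SimpleNamespace(value="/")],
            metric_values=[SimpleNamespace(value="42")],
        )
    ],
)
print(rows_to_dicts(fake_response))  # [{'pagePath': '/', 'screenPageViews': '42'}]
```

The trade-off is that the CSV columns then carry the API's names (for example screenPageViews) rather than the friendlier names used in the script, and numeric conversion has to happen in a separate step.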

6. Running the Script

Set the environment variables and run the script.

export GOOGLE_APPLICATION_CREDENTIALS="$HOME/.config/ga-credentials.json"
export GA_PROPERTY_ID=123456789   # Replace with your actual property ID
python report.py

When it runs successfully, the following CSVs will be generated in the output/ folder:

output/
├── 2026-03-10_pages.csv     # Page performance (views, engagement, bounce rate)
└── 2026-03-10_sources.csv   # Traffic sources (channel, session count)
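
As an illustration of what can be done with the exported files, the sketch below totals sessions per channel from a sources CSV. It relies only on the standard library and on the "channel" and "sessions" columns that the script writes; the function name sessions_by_channel is hypothetical.

```python
import csv
from collections import Counter
from pathlib import Path


def sessions_by_channel(csv_path: Path) -> Counter:
    """Sum the sessions column per channel from a *_sources.csv file."""
    totals = Counter()
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            totals[row["channel"]] += int(row["sessions"])
    return totals


if __name__ == "__main__":
    # Example path; adjust the date to match a file that exists in output/.
    path = Path("output") / "2026-03-10_sources.csv"
    if path.exists():
        for channel, sessions in sessions_by_channel(path).most_common():
            print(f"{sessions:>6}  {channel}")
```

Because the report is limited to the top 20 channel and source pairs, these totals cover only the rows present in the file.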

7. Common Errors and Fixes

Error | Cause | Fix
File ... was not found | Incorrect path to credentials JSON | Verify the GOOGLE_APPLICATION_CREDENTIALS path
403 PERMISSION_DENIED (SERVICE_DISABLED) | API not enabled | Enable the API in Cloud Console and wait a few minutes
403 PERMISSION_DENIED (USER_PERMISSION_DENIED) | Service account not added to GA4 | Add the service account as a Viewer in GA4 Property Access Management
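
The same lookup can be kept next to the script so that a failed run prints the likely fix. This is a sketch under one assumption: that the marker strings from the table appear somewhere in the error text. The names FIX_HINTS and suggest_fix are hypothetical.

```python
from typing import Optional

# (marker text to look for, suggested fix), taken from the table above.
FIX_HINTS = [
    ("was not found", "Verify the GOOGLE_APPLICATION_CREDENTIALS path"),
    ("SERVICE_DISABLED", "Enable the API in Cloud Console and wait a few minutes"),
    (
        "USER_PERMISSION_DENIED",
        "Add the service account as a Viewer in GA4 Property Access Management",
    ),
]


def suggest_fix(error_message: str) -> Optional[str]:
    """Return the suggested fix for a known error message, or None."""
    for marker, fix in FIX_HINTS:
        if marker in error_message:
            return fix
    return None


print(suggest_fix("403 PERMISSION_DENIED (SERVICE_DISABLED)"))
```

One way to use it is to wrap the call to main() in a try/except block and pass str(exc) to suggest_fix, printing the hint when it is not None and re-raising otherwise.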

8. Conclusion

Once the Google Cloud service account is configured, a single Python script is all you need to automatically fetch GA4 data. The exported CSVs slot neatly into an AI-powered analysis workflow, and setting up a cron job for monthly automation makes the whole process nearly maintenance-free.
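
For a monthly cron run, a rolling "30daysAgo" window will not line up with calendar months. Since DateRange also accepts explicit YYYY-MM-DD dates, one option is to compute the previous calendar month and pass those values as start_date and end_date. A minimal sketch, with previous_month_range as a hypothetical helper:

```python
from datetime import date, timedelta


def previous_month_range(today: date) -> tuple[str, str]:
    """Return (start, end) of the previous calendar month as YYYY-MM-DD strings."""
    first_of_this_month = today.replace(day=1)
    # Stepping back one day from the 1st lands on the last day of the previous month.
    last_of_prev = first_of_this_month - timedelta(days=1)
    first_of_prev = last_of_prev.replace(day=1)
    return first_of_prev.isoformat(), last_of_prev.isoformat()


print(previous_month_range(date(2026, 3, 10)))  # ('2026-02-01', '2026-02-28')
```

Running the job on the first few days of each month with this range yields one complete month per report, with no partial current day included.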

How to Fetch Google Analytics 4 Data Automatically with Python | Shirokuma.online