
google-okf

google-okf is an open-source Python library that connects to enterprise data sources and converts their contents into the standard Google Open Knowledge Format (OKF).

google-okf standardizes database schemas, collections, documentation, playbooks, and APIs as Markdown files with structured YAML frontmatter, which lets it serve as the intermediate context-assembly layer for Retrieval-Augmented Generation (RAG) pipelines and agentic AI systems.
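To make the "Markdown with YAML frontmatter" shape concrete, here is a minimal sketch of how a downstream consumer might separate the metadata from the body. It does not use google-okf itself, and the field names in the example document (`title`, `type`) are hypothetical, not taken from the OKF specification. The parser only handles flat `key: value` pairs; a real YAML parser would be needed for anything nested.

```python
from typing import Dict, Tuple


def split_frontmatter(text: str) -> Tuple[Dict[str, str], str]:
    """Split a Markdown document into (frontmatter, body).

    Only flat `key: value` pairs are handled; nested YAML needs a real parser.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no frontmatter block present
    try:
        # Index of the closing '---' delimiter.
        end = next(i for i, line in enumerate(lines[1:], start=1)
                   if line.strip() == "---")
    except StopIteration:
        return {}, text  # opening delimiter without a closing one
    meta: Dict[str, str] = {}
    for line in lines[1:end]:
        if ":" in line and not line.lstrip().startswith("#"):
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return meta, body


# Hypothetical document: the field names are illustrative only.
EXAMPLE = """---
title: Orders table
type: schema
---
# Orders

One row per customer order.
"""

meta, body = split_frontmatter(EXAMPLE)
print(meta)
print(body)
```

Keeping metadata in frontmatter means a retrieval pipeline can filter or route documents on those fields without parsing the Markdown body.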


🧩 RAG Pipeline Integration Flow

google-okf acts as the ingestion and semantic translation layer in your intelligence stack:

[Figure: RAG pipeline diagram]


Getting Started

Installation

Installation Steps

Install the library from PyPI with pip:

pip install google-okf

Alternatively, add it to a uv-managed project:

uv add google-okf


Quickstart (CLI)

google-okf includes a CLI for building and managing OKF bundles:

1. Initialize a Bundle

Create a blank OKF folder structure with default subdirectories:

google-okf init my_knowledge_bundle

2. Run a Flat Files Producer

Import a directory of PDF, DOCX, Markdown, or TXT files into your bundle:

google-okf produce --type files --src-dir ./raw_documents --out-dir my_knowledge_bundle

3. Lint the Bundle

Verify that all YAML frontmatter parses and that all internal relative Markdown links resolve:

google-okf lint my_knowledge_bundle
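For readers curious what a relative-link check involves, the following is a rough, standalone sketch of one way to do it. It is not the implementation behind `google-okf lint`. The function name `find_broken_links` and the demo file names are hypothetical, and the regular expression only covers inline `[text](target)` links, not reference-style links.

```python
import re
import tempfile
from pathlib import Path
from typing import List, Tuple

# Matches inline Markdown links such as [text](target).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Matches targets that start with a URL scheme (https:, mailto:, ...).
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def find_broken_links(bundle_dir: str) -> List[Tuple[str, str]]:
    """Return (markdown_file, target) pairs whose relative target is missing."""
    root = Path(bundle_dir)
    broken: List[Tuple[str, str]] = []
    for md_file in sorted(root.rglob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        for target in LINK_RE.findall(text):
            # Skip external URLs and pure in-page anchors.
            if SCHEME_RE.match(target) or target.startswith("#"):
                continue
            path_part = target.split("#", 1)[0]  # drop any trailing anchor
            if not (md_file.parent / path_part).exists():
                broken.append((md_file.relative_to(root).as_posix(), target))
    return broken


# Small demo on a throwaway directory with one good and one broken link.
with tempfile.TemporaryDirectory() as tmp:
    docs = Path(tmp) / "docs"
    docs.mkdir()
    (docs / "a.md").write_text(
        "[ok](b.md) [missing](c.md) [web](https://example.com)", encoding="utf-8"
    )
    (docs / "b.md").write_text("# B", encoding="utf-8")
    for source, target in find_broken_links(tmp):
        print(f"{source}: unresolved link -> {target}")
```

Links are resolved relative to the file that contains them, which is why the check joins each target to `md_file.parent` rather than to the bundle root.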