Folder Skeleton Proposal: Core + Market Structure MCPs

This document proposes a Python-focused folder structure for the Core Government Data MCP and Market Structure MCP, designed to be portable to TypeScript/Node if needed.

Design Principles

  • Separation of Concerns: Core = raw data access, Market Structure = analytical joins
  • Reusability: Shared utilities in core/ for normalization, mappings, schemas
  • Testability: Clear module boundaries, fixtures, and test structure
  • Extensibility: Easy to add new domain MCPs following the same pattern

Proposed Structure

mcp-servers/
├── core/                          # Shared core utilities (NEW)
│   ├── __init__.py
│   ├── schemas/                   # Canonical data schemas
│   │   ├── __init__.py
│   │   ├── naics.py               # NAICS code normalization, validation
│   │   ├── geography.py           # Geography code normalization, validation
│   │   ├── time.py                # Time alignment, year range validation
│   │   ├── money.py               # Currency, inflation adjustment
│   │   └── units.py               # Unit conversion utilities
│   ├── normalization/             # Data normalization utilities
│   │   ├── __init__.py
│   │   ├── mappings.py            # NAICS/geography concordances
│   │   ├── inflation.py           # Inflation adjustment (BLS CPI)
│   │   └── concordances.py        # Cross-agency code mappings
│   ├── connectors/                # Raw API connectors (extracted from gov-data)
│   │   ├── __init__.py
│   │   ├── census/                # Census API clients
│   │   │   ├── __init__.py
│   │   │   ├── bds_client.py
│   │   │   ├── abs_client.py
│   │   │   ├── cbp_client.py
│   │   │   └── acs_client.py
│   │   ├── bea/                   # BEA API client
│   │   │   ├── __init__.py
│   │   │   └── bea_client.py
│   │   ├── bls/                   # BLS API clients
│   │   │   ├── __init__.py
│   │   │   ├── oes_client.py
│   │   │   ├── ces_client.py
│   │   │   └── ppi_client.py
│   │   ├── fred/                  # FRED API client
│   │   │   ├── __init__.py
│   │   │   └── fred_client.py
│   │   └── treasury/              # Treasury API client (future)
│   │       ├── __init__.py
│   │       └── treasury_client.py
│   └── provenance/                # Provenance tracking
│       ├── __init__.py
│       ├── metadata.py            # Source metadata builder
│       └── citation.py            # Citation generator
│
├── government-data/               # Core Government Data MCP (EXISTING - rebranded)
│   ├── server.py                  # MCP server (wraps connectors)
│   ├── tools/                     # MCP tool wrappers (NEW)
│   │   ├── __init__.py
│   │   ├── census_tools.py        # fetch_raw_census_* tools
│   │   ├── bea_tools.py           # fetch_raw_bea_* tools
│   │   ├── bls_tools.py           # fetch_raw_bls_* tools
│   │   └── fred_tools.py          # fetch_raw_fred_* tools
│   ├── adapters/                  # Adapters from connectors to MCP tools (NEW)
│   │   ├── __init__.py
│   │   └── mcp_adapter.py         # Converts connector responses to MCP format
│   ├── census_client.py           # (DEPRECATED - move to core/connectors/)
│   ├── bls_client.py              # (DEPRECATED - move to core/connectors/)
│   ├── fred_client.py             # (DEPRECATED - move to core/connectors/)
│   ├── bea_client.py              # (DEPRECATED - move to core/connectors/)
│   ├── cache.py                   # Caching utilities
│   ├── errors.py                  # Error handling
│   ├── requirements.txt
│   ├── README.md                  # (UPDATED - raw-only positioning)
│   └── SETUP.md
│
├── market-structure/              # Market Structure MCP (EXISTING - enhanced)
│   ├── server.py                  # MCP server
│   ├── tools/                     # MCP tool implementations (NEW)
│   │   ├── __init__.py
│   │   ├── market_size.py         # get_market_size implementation
│   │   ├── firm_counts.py         # get_firm_counts implementation
│   │   ├── entry_exit.py          # get_entry_exit_rates implementation
│   │   ├── fragmentation.py       # get_market_fragmentation implementation
│   │   └── reachability.py        # get_market_reachability implementation
│   ├── logic/                     # Analytical logic (NEW)
│   │   ├── __init__.py
│   │   ├── joins.py               # Data joining logic (Census + BEA)
│   │   ├── fragmentation.py       # HHI, concentration calculations
│   │   ├── reachability.py        # Addressable market calculations
│   │   └── aggregations.py        # Firm size aggregations
│   ├── clients.py                 # Wrapper for Core Gov MCP (EXISTING)
│   ├── market_analyzer.py         # (REFACTOR - use tools/ + logic/)
│   ├── cache/                     # Caching layer (NEW)
│   │   ├── __init__.py
│   │   └── cache_manager.py       # Cache for expensive computations
│   ├── tests/                     # Test suite
│   │   ├── __init__.py
│   │   ├── fixtures/              # Test data fixtures
│   │   │   ├── __init__.py
│   │   │   └── sample_responses.py
│   │   ├── test_market_size.py
│   │   ├── test_firm_counts.py
│   │   ├── test_entry_exit.py
│   │   ├── test_fragmentation.py
│   │   └── test_reachability.py
│   ├── requirements.txt
│   └── README.md                  # (EXISTING - already good)
│
└── shared/                        # Shared utilities (EXISTING)
    ├── __init__.py
    ├── errors.py                  # Common error types
    ├── logging.py                 # Logging utilities
    └── utils.py                   # General utilities

Key Files and Their Roles

Core Layer (core/)

Purpose: Shared primitives used by all MCPs.

core/schemas/naics.py

  • NAICS code normalization (strip leading zeros)
  • NAICS code validation
  • NAICS hierarchy lookup (parent/child codes)
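
As a rough illustration of the validation and hierarchy bullets above, the sketch below checks only the format of a code (2 to 6 digits) and derives ancestors by prefix truncation. `validate_naics` matches a name used in this document's import examples, but its body here is an assumption; `naics_ancestors` is a hypothetical helper. The sketch does not check codes against an official NAICS table and does not handle range sectors such as 31-33.

```python
import re

# Format check only: 2 to 6 ASCII digits. Existence in a NAICS table is not checked.
_NAICS_RE = re.compile(r"\d{2,6}", re.ASCII)


def validate_naics(code: str) -> bool:
    """Return True if code looks like a 2- to 6-digit NAICS code."""
    return _NAICS_RE.fullmatch(code) is not None


def naics_ancestors(code: str) -> list[str]:
    """Return the prefixes of code, from the 2-digit level up to its direct parent."""
    if not validate_naics(code):
        raise ValueError(f"not a 2-6 digit NAICS code: {code!r}")
    return [code[:n] for n in range(2, len(code))]
```

For a 6-digit code this yields the 2-, 3-, 4- and 5-digit prefixes; a 2-digit sector code has no ancestors.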

core/schemas/geography.py

  • Geography code normalization (to a canonical form such as "state:06")
  • Geography validation (FIPS codes, CBSA codes)
  • Geography hierarchy (state → county; counties also roll up to CBSAs/MSAs, which can cross state lines)
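
A minimal sketch of the normalization step, assuming inputs arrive as "level:code" strings and that the canonical form zero-pads the code to a fixed width (2 digits for state FIPS, 5 for full county FIPS and for CBSA codes). The set of accepted levels and the function body are assumptions for illustration; only the name `normalize_geography` comes from this document.

```python
# Expected code width per geography level (assumed set of levels).
_WIDTHS = {"state": 2, "county": 5, "cbsa": 5}


def normalize_geography(value: str) -> str:
    """Normalize e.g. 'State: 6' to 'state:06'. Raises ValueError on bad input."""
    level, sep, code = value.strip().lower().partition(":")
    level, code = level.strip(), code.strip()
    if not sep or level not in _WIDTHS:
        raise ValueError(f"unknown geography level in {value!r}")
    if not (code.isascii() and code.isdigit()) or len(code) > _WIDTHS[level]:
        raise ValueError(f"invalid code for {level}: {code!r}")
    # Left-pad with zeros so codes compare equal regardless of input formatting.
    return f"{level}:{code.zfill(_WIDTHS[level])}"
```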

core/schemas/time.py

  • Year validation (check against dataset availability)
  • Year range validation (start_year <= end_year)
  • Time alignment utilities
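
The two validation rules above could be combined roughly as follows. The function name `validate_year_range` and the idea of passing availability in as a collection of years are assumptions; how availability is actually looked up per dataset is not specified in this proposal.

```python
from collections.abc import Collection


def validate_year_range(
    start_year: int, end_year: int, available_years: Collection[int]
) -> list[int]:
    """Return the requested years, or raise ValueError if the range is invalid."""
    if start_year > end_year:
        raise ValueError(f"start_year {start_year} is after end_year {end_year}")
    years = list(range(start_year, end_year + 1))
    # Report every year the dataset does not cover, not just the first one.
    missing = [y for y in years if y not in available_years]
    if missing:
        raise ValueError(f"years not available in dataset: {missing}")
    return years
```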

core/normalization/inflation.py

  • Inflation adjustment using BLS CPI
  • Base year conversion
  • Currency conversion (if needed)
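
Base-year conversion with a price index is the standard ratio: amount × index(target year) / index(source year). The sketch below assumes the CPI series has already been fetched into a year-to-index mapping; the signature of `adjust_for_inflation` is an assumption, and the index values in the usage note are placeholders, not real BLS figures.

```python
from collections.abc import Mapping


def adjust_for_inflation(
    amount: float, from_year: int, to_year: int, cpi: Mapping[int, float]
) -> float:
    """Express amount (in from_year dollars) in to_year dollars."""
    try:
        return amount * cpi[to_year] / cpi[from_year]
    except KeyError as exc:
        raise ValueError(f"no CPI value for year {exc.args[0]}") from None
```

With a placeholder index of `{2000: 100.0, 2010: 125.0}`, `adjust_for_inflation(200.0, 2000, 2010, cpi)` scales the amount by 125/100.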

core/connectors/

  • Raw API clients extracted from government-data/
  • Each connector is a thin wrapper around API calls
  • Returns raw API responses (no interpretation)
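
One way to keep a connector thin is to have it do nothing but build the request URL and hand it to an injected fetch function, returning whatever comes back. The class name `RawConnector`, the `fetch` parameter, and the example URL are all hypothetical; the real clients (for example `BDSClient`) may be structured differently.

```python
from collections.abc import Callable, Mapping
from typing import Any
from urllib.parse import urlencode


class RawConnector:
    """Builds request URLs and returns the fetcher's response untouched."""

    def __init__(self, base_url: str, fetch: Callable[[str], Any]) -> None:
        self.base_url = base_url.rstrip("/")
        self._fetch = fetch  # e.g. a function wrapping requests.get(...).json()

    def get(self, path: str, params: Mapping[str, str]) -> Any:
        url = f"{self.base_url}/{path.lstrip('/')}?{urlencode(params)}"
        # No parsing or interpretation here; that belongs to higher layers.
        return self._fetch(url)
```

Injecting `fetch` also makes the connector easy to test against recorded responses without network access.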

Core Government Data MCP (government-data/)

Purpose: Raw data access layer, MCP server wrapping connectors.

government-data/tools/

  • MCP tool wrappers (e.g., fetch_raw_census_bds_data)
  • Each tool calls appropriate connector
  • Formats response for MCP protocol
  • Adds provenance metadata

government-data/adapters/mcp_adapter.py

  • Converts connector responses to MCP tool format
  • Adds error handling
  • Adds provenance envelope
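
The adapter's three responsibilities might be sketched as a single wrapper. The envelope keys (`ok`, `data`, `error`, `provenance`) and the function name `to_mcp_result` are assumptions made for illustration; they are not the MCP protocol's wire format, which the MCP SDK handles.

```python
from collections.abc import Callable
from datetime import datetime, timezone
from typing import Any


def to_mcp_result(source: str, fetch: Callable[[], Any]) -> dict[str, Any]:
    """Call a connector and wrap its raw output (or its error) with provenance."""
    provenance = {
        "source": source,
        "retrieved_at": datetime.now(timezone.utc).isoformat(),
    }
    try:
        return {"ok": True, "data": fetch(), "provenance": provenance}
    except Exception as exc:  # report the failure instead of crashing the server
        return {
            "ok": False,
            "error": f"{type(exc).__name__}: {exc}",
            "provenance": provenance,
        }
```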

Market Structure MCP (market-structure/)

Purpose: Analytical engine composing Core Gov Data outputs.

market-structure/tools/

  • MCP tool implementations (5 tools)
  • Each tool calls logic/ functions
  • Handles input validation (using core/schemas/)
  • Formats response with provenance

market-structure/logic/joins.py

  • Joins Census ABS + BEA GDP data
  • Handles data suppression
  • Aggregates by NAICS/geography/time
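
A minimal sketch of the join, assuming both inputs are already normalized into lists of dicts sharing the keys `naics`, `geo` and `year`, and that a suppressed Census value is represented as `None`. The row shapes, the `firms` and `gdp` field names, and the `suppressed` flag are hypothetical; only the function name `join_abs_bea_data` appears in this document.

```python
from typing import Any

JOIN_KEYS = ("naics", "geo", "year")


def _key(row: dict[str, Any]) -> tuple:
    return tuple(row[k] for k in JOIN_KEYS)


def join_abs_bea_data(
    abs_rows: list[dict[str, Any]], bea_rows: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    """Inner-join ABS firm counts with BEA GDP on (naics, geo, year)."""
    gdp_by_key = {_key(r): r["gdp"] for r in bea_rows}
    joined = []
    for row in abs_rows:
        key = _key(row)
        if key not in gdp_by_key:
            continue  # no matching BEA row: drop (inner join)
        joined.append({
            **dict(zip(JOIN_KEYS, key)),
            "firms": row["firms"],
            "gdp": gdp_by_key[key],
            # Keep suppressed cells visible rather than treating them as zero.
            "suppressed": row["firms"] is None,
        })
    return joined
```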

market-structure/logic/fragmentation.py

  • Calculates HHI (Herfindahl-Hirschman Index)
  • Calculates concentration ratios (top 4, top 8)
  • Computes fragmentation score
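
For reference, HHI is the sum of squared market shares (with shares in percent it ranges from near 0 to 10,000), and a concentration ratio CRn is the combined share of the n largest firms. The sketch below assumes firm-level revenues are available as a plain list, which public aggregate data may not provide; the signatures are assumptions, and the fragmentation score is left out because its definition is not given here.

```python
from collections.abc import Sequence


def market_shares(revenues: Sequence[float]) -> list[float]:
    """Convert revenues to percentage shares of the total."""
    total = sum(revenues)
    if total <= 0:
        raise ValueError("total revenue must be positive")
    return [100.0 * r / total for r in revenues]


def calculate_hhi(revenues: Sequence[float]) -> float:
    """Herfindahl-Hirschman Index: sum of squared percentage shares."""
    return sum(share ** 2 for share in market_shares(revenues))


def concentration_ratio(revenues: Sequence[float], top_n: int = 4) -> float:
    """Combined percentage share of the top_n largest firms (CR4 by default)."""
    return sum(sorted(market_shares(revenues), reverse=True)[:top_n])
```

Four equally sized firms give shares of 25% each, so the HHI is 4 × 25² = 2,500 and CR4 is 100%.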

market-structure/logic/reachability.py

  • Maps firm size categories to CBP categories
  • Aggregates addressable firms/revenue
  • Calculates reachability percentage
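
The reachability percentage can be read as addressable firms divided by all firms, times 100. The sketch below assumes firm counts are already grouped by size band; the band labels and the name `reachability_pct` are hypothetical, and the mapping from user-facing size categories to CBP categories is not shown.

```python
from collections.abc import Mapping, Set


def reachability_pct(firms_by_size: Mapping[str, int], addressable: Set[str]) -> float:
    """Percentage of firms that fall in the addressable size bands."""
    total = sum(firms_by_size.values())
    if total == 0:
        return 0.0  # no firms at all: nothing is reachable
    reachable = sum(n for band, n in firms_by_size.items() if band in addressable)
    return 100.0 * reachable / total
```

For example, with 600, 300 and 100 firms in three bands and the last two bands addressable, 400 of 1,000 firms are reachable, or 40%.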

Migration Path

Phase 1: Extract Connectors (Week 1)

  1. Create core/connectors/ structure
  2. Move census_client.py, bls_client.py, etc. to core/connectors/
  3. Update imports in government-data/server.py
  4. Test that existing functionality still works

Phase 2: Add Core Utilities (Week 1-2)

  1. Create core/schemas/ modules
  2. Implement NAICS/geography/time normalization
  3. Create core/normalization/inflation.py
  4. Add tests for core utilities

Phase 3: Refactor Government Data MCP (Week 2)

  1. Create government-data/tools/ wrappers
  2. Create government-data/adapters/mcp_adapter.py
  3. Update server.py to use new structure
  4. Ensure all tools use fetch_raw_* naming

Phase 4: Enhance Market Structure MCP (Week 2-3)

  1. Create market-structure/tools/ modules
  2. Extract logic to market-structure/logic/
  3. Update market_analyzer.py to use new structure
  4. Add comprehensive tests

Dependencies

Core Dependencies (all MCPs)

# core/requirements.txt (or pyproject.toml)
requests>=2.31.0
python-dotenv>=1.0.0

Government Data MCP Dependencies

# government-data/requirements.txt
mcp>=0.1.0  # MCP Python SDK
# Plus core dependencies (via relative import or package)

Market Structure MCP Dependencies

# market-structure/requirements.txt
mcp>=0.1.0  # MCP Python SDK
numpy>=1.24.0  # For calculations (HHI, etc.)
# Plus core dependencies

Import Patterns

Core Utilities

# In any MCP
from core.schemas.naics import normalize_naics, validate_naics
from core.schemas.geography import normalize_geography, validate_geography
from core.normalization.inflation import adjust_for_inflation

Connectors

# In government-data MCP
from core.connectors.census.bds_client import BDSClient
from core.connectors.bea.bea_client import BEAClient

Market Structure Logic

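# In market-structure MCP
# Note: a hyphenated directory name (market-structure/) is not a valid Python
# module name; these imports assume the package is exposed as market_structure.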
from market_structure.logic.joins import join_abs_bea_data
from market_structure.logic.fragmentation import calculate_hhi

Testing Structure

Unit Tests

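  • core/schemas/tests/ - Test normalization, validation (directory to be added; not shown in the tree above)
  • core/normalization/tests/ - Test inflation, mappings (directory to be added; not shown in the tree above)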
  • market-structure/tests/ - Test each tool independently

Integration Tests

  • market-structure/tests/integration/ - Test full tool flows
  • Use fixtures from tests/fixtures/sample_responses.py

Test Fixtures

  • Mock API responses in tests/fixtures/
  • Use pytest fixtures for reusable test data

Documentation Structure

docs/
├── architecture.md                # (CREATED) Overall architecture
├── contracts/
│   └── market_structure.md        # (CREATED) Tool contracts
├── methodology/
│   ├── tam_sam_som.md             # TAM/SAM/SOM definitions
│   ├── fragmentation.md           # HHI methodology
│   └── reachability.md            # Reachability methodology
└── skeleton_proposal.md           # (THIS FILE)

Next Steps

  1. Review this proposal - Validate structure makes sense
  2. Create core/ directory - Start with schemas and connectors
  3. Migrate connectors - Move from government-data/ to core/connectors/
  4. Refactor government-data - Add tools/ and adapters/
  5. Enhance market-structure - Add tools/ and logic/ separation

Questions to Resolve

  1. Package structure: Should core/ be an installable Python package (e.g. pip install -e core/), or be imported in place from the repository root?
    • Recommendation: Start with in-repo imports, move to an installable package later if needed
  2. Connector extraction: Should connectors be extracted immediately, or kept in government-data/ for now?
    • Recommendation: Extract gradually (Phase 1 migration path)
  3. Shared utilities: Should shared/ be merged into core/?
    • Recommendation: Keep shared/ for now, merge later if it makes sense
  4. Testing framework: Use pytest or unittest?
    • Recommendation: pytest (built-in fixtures and parametrization; it can also run existing unittest-style tests)
