Folder Skeleton Proposal: Core + Market Structure MCPs
This document proposes a Python-focused folder structure for the Core Government Data MCP and Market Structure MCP, designed to be portable to TypeScript/Node if needed.
Design Principles
- Separation of Concerns: Core = raw data access, Market Structure = analytical joins
- Reusability: Shared utilities in core/ for normalization, mappings, schemas
- Testability: Clear module boundaries, fixtures, and test structure
- Extensibility: Easy to add new domain MCPs following the same pattern
Proposed Structure
mcp-servers/
├── core/  # Shared core utilities (NEW)
│   ├── __init__.py
│   ├── schemas/  # Canonical data schemas
│   │   ├── __init__.py
│   │   ├── naics.py  # NAICS code normalization, validation
│   │   ├── geography.py  # Geography code normalization, validation
│   │   ├── time.py  # Time alignment, year range validation
│   │   ├── money.py  # Currency, inflation adjustment
│   │   └── units.py  # Unit conversion utilities
│   ├── normalization/  # Data normalization utilities
│   │   ├── __init__.py
│   │   ├── mappings.py  # NAICS/geography concordances
│   │   ├── inflation.py  # Inflation adjustment (BLS CPI)
│   │   └── concordances.py  # Cross-agency code mappings
│   ├── connectors/  # Raw API connectors (extracted from gov-data)
│   │   ├── __init__.py
│   │   ├── census/  # Census API clients
│   │   │   ├── __init__.py
│   │   │   ├── bds_client.py
│   │   │   ├── abs_client.py
│   │   │   ├── cbp_client.py
│   │   │   └── acs_client.py
│   │   ├── bea/  # BEA API client
│   │   │   ├── __init__.py
│   │   │   └── bea_client.py
│   │   ├── bls/  # BLS API clients
│   │   │   ├── __init__.py
│   │   │   ├── oes_client.py
│   │   │   ├── ces_client.py
│   │   │   └── ppi_client.py
│   │   ├── fred/  # FRED API client
│   │   │   ├── __init__.py
│   │   │   └── fred_client.py
│   │   └── treasury/  # Treasury API client (future)
│   │       ├── __init__.py
│   │       └── treasury_client.py
│   └── provenance/  # Provenance tracking
│       ├── __init__.py
│       ├── metadata.py  # Source metadata builder
│       └── citation.py  # Citation generator
│
├── government-data/  # Core Government Data MCP (EXISTING - rebranded)
│   ├── server.py  # MCP server (wraps connectors)
│   ├── tools/  # MCP tool wrappers (NEW)
│   │   ├── __init__.py
│   │   ├── census_tools.py  # fetch_raw_census_* tools
│   │   ├── bea_tools.py  # fetch_raw_bea_* tools
│   │   ├── bls_tools.py  # fetch_raw_bls_* tools
│   │   └── fred_tools.py  # fetch_raw_fred_* tools
│   ├── adapters/  # Adapters from connectors to MCP tools (NEW)
│   │   ├── __init__.py
│   │   └── mcp_adapter.py  # Converts connector responses to MCP format
│   ├── census_client.py  # (DEPRECATED - move to core/connectors/)
│   ├── bls_client.py  # (DEPRECATED - move to core/connectors/)
│   ├── fred_client.py  # (DEPRECATED - move to core/connectors/)
│   ├── bea_client.py  # (DEPRECATED - move to core/connectors/)
│   ├── cache.py  # Caching utilities
│   ├── errors.py  # Error handling
│   ├── requirements.txt
│   ├── README.md  # (UPDATED - raw-only positioning)
│   └── SETUP.md
│
├── market-structure/  # Market Structure MCP (EXISTING - enhanced)
│   ├── server.py  # MCP server
│   ├── tools/  # MCP tool implementations (NEW)
│   │   ├── __init__.py
│   │   ├── market_size.py  # get_market_size implementation
│   │   ├── firm_counts.py  # get_firm_counts implementation
│   │   ├── entry_exit.py  # get_entry_exit_rates implementation
│   │   ├── fragmentation.py  # get_market_fragmentation implementation
│   │   └── reachability.py  # get_market_reachability implementation
│   ├── logic/  # Analytical logic (NEW)
│   │   ├── __init__.py
│   │   ├── joins.py  # Data joining logic (Census + BEA)
│   │   ├── fragmentation.py  # HHI, concentration calculations
│   │   ├── reachability.py  # Addressable market calculations
│   │   └── aggregations.py  # Firm size aggregations
│   ├── clients.py  # Wrapper for Core Gov MCP (EXISTING)
│   ├── market_analyzer.py  # (REFACTOR - use tools/ + logic/)
│   ├── cache/  # Caching layer (NEW)
│   │   ├── __init__.py
│   │   └── cache_manager.py  # Cache for expensive computations
│   ├── tests/  # Test suite
│   │   ├── __init__.py
│   │   ├── fixtures/  # Test data fixtures
│   │   │   ├── __init__.py
│   │   │   └── sample_responses.py
│   │   ├── test_market_size.py
│   │   ├── test_firm_counts.py
│   │   ├── test_entry_exit.py
│   │   ├── test_fragmentation.py
│   │   └── test_reachability.py
│   ├── requirements.txt
│   └── README.md  # (EXISTING - already good)
│
└── shared/  # Shared utilities (EXISTING)
    ├── __init__.py
    ├── errors.py  # Common error types
    ├── logging.py  # Logging utilities
    └── utils.py  # General utilities
Key Files and Their Roles
Core Layer (core/)
Purpose: Shared primitives used by all MCPs.
core/schemas/naics.py
- NAICS code normalization (strip leading zeros, validate)
- NAICS hierarchy lookup (parent/child codes)
- NAICS code validation
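The responsibilities listed for core/schemas/naics.py could be sketched roughly as follows. The names normalize_naics and validate_naics come from the import patterns in this proposal; their signatures, the parent_naics helper, and the purely structural 2-to-6-digit check are assumptions made for illustration. Range sectors such as 31-33 are not handled here.

```python
import re
from typing import Optional

# Structural check only: a NAICS code is 2 to 6 digits.
# This does not confirm the code exists in any NAICS vintage.
_NAICS_RE = re.compile(r"^\d{2,6}$")


def normalize_naics(code) -> str:
    """Return the code as a bare digit string without whitespace or leading zeros."""
    return str(code).strip().lstrip("0")


def validate_naics(code: str) -> bool:
    """True when the code has a plausible NAICS shape (2 to 6 digits)."""
    return bool(_NAICS_RE.match(code))


def parent_naics(code: str) -> Optional[str]:
    """Return the next level up in the hierarchy, or None for a 2-digit sector."""
    if not validate_naics(code):
        raise ValueError(f"not a valid NAICS code: {code!r}")
    return code[:-1] if len(code) > 2 else None
```

A real implementation would likely validate against a published code list per NAICS vintage rather than relying on shape alone.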
core/schemas/geography.py
- Geography code normalization to a canonical form (e.g., "state:06")
- Geography validation (FIPS codes, CBSA codes)
- Geography hierarchy (state → county → MSA)
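One way the geography normalization above might look, assuming the "level:code" string form used in this document. The set of supported levels, the function signatures, and the choice to zero-pad short codes are assumptions; the digit widths (2-digit state FIPS, 5-digit county FIPS, 5-digit CBSA) follow the standard code lengths.

```python
# Digits per geography level: state FIPS = 2, county FIPS = 5, CBSA = 5.
_CODE_WIDTH = {"state": 2, "county": 5, "cbsa": 5}


def normalize_geography(geo: str) -> str:
    """Normalize strings such as 'State: 6' to the canonical form 'state:06'."""
    level, sep, code = geo.partition(":")
    level = level.strip().lower()
    code = code.strip()
    width = _CODE_WIDTH.get(level)
    if not sep or width is None:
        raise ValueError(f"unknown geography: {geo!r}")
    if not (code.isascii() and code.isdigit()) or len(code) > width:
        raise ValueError(f"bad code for level {level!r}: {code!r}")
    return f"{level}:{code.zfill(width)}"


def validate_geography(geo: str) -> bool:
    """True only when the string is already in canonical form."""
    try:
        return normalize_geography(geo) == geo
    except ValueError:
        return False
```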
core/schemas/time.py
- Year validation (check against dataset availability)
- Year range validation (start_year <= end_year)
- Time alignment utilities
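The two validation checks for core/schemas/time.py can be combined in a small helper. This is a sketch only: the function name validate_year_range and the idea of passing dataset availability as an iterable of years are assumptions, not part of the existing code.

```python
from typing import Iterable, List


def validate_year_range(
    start_year: int, end_year: int, available_years: Iterable[int]
) -> List[int]:
    """Return the requested years, or raise ValueError if the range is invalid."""
    if start_year > end_year:
        raise ValueError(f"start_year {start_year} is after end_year {end_year}")
    available = set(available_years)
    requested = list(range(start_year, end_year + 1))
    # Report every missing year at once so the caller can fix the request in one pass.
    missing = [year for year in requested if year not in available]
    if missing:
        raise ValueError(f"years not available in dataset: {missing}")
    return requested
```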
core/normalization/inflation.py
- Inflation adjustment using BLS CPI
- Base year conversion
- Currency conversion (if needed)
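Base year conversion with a price index reduces to scaling by the ratio of the two index values. The sketch below assumes a signature for adjust_for_inflation (the name appears in the import patterns, the parameters do not) and takes the CPI series as a plain mapping. The index values in the example are placeholders, not real BLS figures.

```python
from typing import Mapping


def adjust_for_inflation(
    amount: float, from_year: int, to_year: int, cpi_by_year: Mapping[int, float]
) -> float:
    """Convert an amount in from_year dollars to to_year dollars."""
    try:
        cpi_from = cpi_by_year[from_year]
        cpi_to = cpi_by_year[to_year]
    except KeyError as exc:
        raise ValueError(f"no CPI value for year {exc.args[0]}") from exc
    return amount * cpi_to / cpi_from


# Placeholder index values for illustration only, NOT real BLS CPI data.
EXAMPLE_CPI = {2010: 100.0, 2020: 120.0}
print(adjust_for_inflation(1000.0, 2010, 2020, EXAMPLE_CPI))  # 1200.0
```

In practice the mapping would be loaded from a BLS CPI series through the corresponding connector.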
core/connectors/
- Raw API clients extracted from government-data/
- Each connector is a thin wrapper around API calls
- Returns raw API responses (no interpretation)
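A thin connector in this sense might look like the sketch below. BDSClient is the class name used in the import patterns; the constructor arguments, the injected transport callable, and the method names are assumptions, and the base URL should be checked against the Census API documentation. Injecting the transport keeps the connector testable without network access.

```python
from typing import Any, Callable, Mapping, Optional
from urllib.parse import urlencode


class BDSClient:
    """Builds request URLs and returns whatever the transport returns, untouched."""

    def __init__(
        self,
        transport: Callable[[str], Any],
        base_url: str = "https://api.census.gov/data/timeseries/bds",  # assumed endpoint
        api_key: Optional[str] = None,
    ) -> None:
        self._transport = transport
        self._base_url = base_url
        self._api_key = api_key

    def build_url(self, params: Mapping[str, str]) -> str:
        """Append query parameters (and the API key, if any) to the base URL."""
        query = dict(params)
        if self._api_key:
            query["key"] = self._api_key
        return f"{self._base_url}?{urlencode(query)}"

    def fetch(self, params: Mapping[str, str]) -> Any:
        """Return the raw response; no parsing, renaming, or interpretation."""
        return self._transport(self.build_url(params))
```

With the requests dependency listed in this proposal, a production transport could be as small as `lambda url: requests.get(url, timeout=30).json()`.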
Core Government Data MCP (government-data/)
Purpose: Raw data access layer, MCP server wrapping connectors.
government-data/tools/
- MCP tool wrappers (e.g., fetch_raw_census_bds_data)
- Each tool calls appropriate connector
- Formats response for MCP protocol
- Adds provenance metadata
government-data/adapters/mcp_adapter.py
- Converts connector responses to MCP tool format
- Adds error handling
- Adds provenance envelope
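The adapter's job can be illustrated with plain dictionaries. The envelope shape, field names, and function names below are hypothetical; this sketch deliberately avoids the MCP SDK's own types, which the real adapter would need to target.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def wrap_with_provenance(
    raw: Any,
    *,
    source: str,
    dataset: str,
    request_url: str,
    retrieved_at: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Wrap a raw connector response in a success envelope with provenance."""
    stamp = retrieved_at or datetime.now(timezone.utc)
    return {
        "ok": True,
        "data": raw,  # passed through unchanged
        "provenance": {
            "source": source,
            "dataset": dataset,
            "request_url": request_url,
            "retrieved_at": stamp.isoformat(),
        },
    }


def wrap_error(exc: Exception, *, source: str) -> Dict[str, Any]:
    """Convert an exception into an error envelope instead of letting it propagate."""
    return {
        "ok": False,
        "data": None,
        "error": {"type": type(exc).__name__, "message": str(exc)},
        "provenance": {"source": source},
    }
```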
Market Structure MCP (market-structure/)
Purpose: Analytical engine composing Core Gov Data outputs.
market-structure/tools/
- MCP tool implementations (5 tools)
- Each tool calls logic/ functions
- Handles input validation (using core/schemas/)
- Formats response with provenance
market-structure/logic/joins.py
- Joins Census ABS + BEA GDP data
- Handles data suppression
- Aggregates by NAICS/geography/time
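A minimal sketch of the join, assuming both inputs have already been reshaped into lists of dictionaries keyed by NAICS code, geography, and year. The name join_abs_bea_data comes from the import patterns; the row field names (naics, geography, year, firms, gdp) and the convention that a suppressed Census value arrives as None are assumptions.

```python
from typing import Any, Dict, Iterable, List, Tuple

Key = Tuple[str, str, int]


def _key(row: Dict[str, Any]) -> Key:
    """Join key shared by both sources."""
    return (row["naics"], row["geography"], row["year"])


def join_abs_bea_data(
    abs_rows: Iterable[Dict[str, Any]], bea_rows: Iterable[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Inner-join ABS firm counts with BEA GDP on (naics, geography, year)."""
    bea_by_key = {_key(row): row for row in bea_rows}
    joined = []
    for row in abs_rows:
        match = bea_by_key.get(_key(row))
        if match is None:
            continue  # no BEA counterpart: drop the row (inner join)
        firms = row.get("firms")
        joined.append(
            {
                "naics": row["naics"],
                "geography": row["geography"],
                "year": row["year"],
                "firms": firms,
                "gdp": match.get("gdp"),
                # Flag suppressed cells rather than treating them as zero.
                "suppressed": firms is None,
            }
        )
    return joined
```

Keeping suppressed rows with an explicit flag lets downstream aggregation decide whether to skip or impute them.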
market-structure/logic/fragmentation.py
- Calculates HHI (Herfindahl-Hirschman Index)
- Calculates concentration ratios (top 4, top 8)
- Computes fragmentation score
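The HHI is the sum of squared market shares expressed in percent, so it ranges from near 0 (many tiny firms) to 10,000 (a monopoly), and a concentration ratio is the combined share of the top N firms. The sketch below assumes firm-level revenues are available as input; calculate_hhi is the name used in the import patterns, while the other helper names and all signatures are assumptions.

```python
from typing import List, Sequence


def market_shares(revenues: Sequence[float]) -> List[float]:
    """Convert revenues to percentage shares of the total."""
    total = float(sum(revenues))
    if total <= 0:
        raise ValueError("total revenue must be positive")
    return [100.0 * revenue / total for revenue in revenues]


def calculate_hhi(revenues: Sequence[float]) -> float:
    """Herfindahl-Hirschman Index: sum of squared percentage shares."""
    return sum(share * share for share in market_shares(revenues))


def concentration_ratio(revenues: Sequence[float], top_n: int) -> float:
    """Combined percentage share of the top_n largest firms (e.g. CR4, CR8)."""
    shares = sorted(market_shares(revenues), reverse=True)
    return sum(shares[:top_n])
```

For example, three firms with revenues 50, 30, and 20 hold shares of 50%, 30%, and 20%, giving an HHI of 2500 + 900 + 400 = 3800. When only size-class aggregates are published, the inputs would have to be estimated, and the result is an approximation.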
market-structure/logic/reachability.py
- Maps firm size categories to CBP categories
- Aggregates addressable firms/revenue
- Calculates reachability percentage
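The three reachability steps can be sketched as a mapping followed by a filtered sum. Everything named here is hypothetical: the category labels, the size-class labels, the mapping between them, and the output field names would need to match the actual CBP size classes and the tool contract.

```python
from typing import Dict, Iterable, Mapping, Sequence

# Hypothetical mapping from user-facing categories to size-class labels.
SIZE_CLASS_MAP: Dict[str, Sequence[str]] = {
    "small": ["1-4", "5-9"],
    "medium": ["10-19", "20-49"],
}


def reachability(
    firms_by_size_class: Mapping[str, int],
    target_categories: Iterable[str],
    size_class_map: Mapping[str, Sequence[str]] = SIZE_CLASS_MAP,
) -> Dict[str, float]:
    """Count firms in the targeted size classes and express them as a share of all firms."""
    addressable_classes = {
        size_class
        for category in target_categories
        for size_class in size_class_map[category]
    }
    total = sum(firms_by_size_class.values())
    addressable = sum(
        count
        for size_class, count in firms_by_size_class.items()
        if size_class in addressable_classes
    )
    pct = 100.0 * addressable / total if total else 0.0
    return {
        "addressable_firms": addressable,
        "total_firms": total,
        "reachability_pct": pct,
    }
```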
Migration Path
Phase 1: Extract Connectors (Week 1)
- Create core/connectors/ structure
- Move census_client.py, bls_client.py, etc. to core/connectors/
- Update imports in government-data/server.py
- Test that existing functionality still works
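To keep existing functionality working while imports are being updated, the old module locations could temporarily re-export the moved classes through a shim that emits a deprecation warning. This is an optional technique, not something the proposal prescribes; the helper name deprecated_alias and the example paths are hypothetical.

```python
import warnings
from typing import Any, Callable


def deprecated_alias(target: Callable[..., Any], old_path: str, new_path: str) -> Callable[..., Any]:
    """Return a callable that forwards to target and warns about the old import path."""

    def wrapper(*args: Any, **kwargs: Any) -> Any:
        warnings.warn(
            f"{old_path} is deprecated; import from {new_path} instead",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not at this shim
        )
        return target(*args, **kwargs)

    return wrapper
```

The old census_client.py would then contain little more than an assignment such as `CensusClient = deprecated_alias(NewClient, "census_client", "core.connectors.census")`, where both class names are placeholders. Because the wrapper is a function rather than a class, isinstance checks against the old name would not work, which is one reason to treat the shim as short-lived.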
Phase 2: Add Core Utilities (Week 1-2)
- Create core/schemas/ modules
- Implement NAICS/geography/time normalization
- Create core/normalization/inflation.py
- Add tests for core utilities
Phase 3: Refactor Government Data MCP (Week 2)
- Create government-data/tools/ wrappers
- Create government-data/adapters/mcp_adapter.py
- Update server.py to use new structure
- Ensure all tools use fetch_raw_* naming
Phase 4: Enhance Market Structure MCP (Week 2-3)
- Create market-structure/tools/ modules
- Extract logic to market-structure/logic/
- Update market_analyzer.py to use new structure
- Add comprehensive tests
Dependencies
Core Dependencies (all MCPs)
# core/requirements.txt (or pyproject.toml)
requests>=2.31.0
python-dotenv>=1.0.0
Government Data MCP Dependencies
# government-data/requirements.txt
mcp>=0.1.0 # MCP Python SDK
# Plus core dependencies (via relative import or package)
Market Structure MCP Dependencies
# market-structure/requirements.txt
mcp>=0.1.0 # MCP Python SDK
numpy>=1.24.0 # For calculations (HHI, etc.)
# Plus core dependencies
Import Patterns
Core Utilities
# In any MCP
from core.schemas.naics import normalize_naics, validate_naics
from core.schemas.geography import normalize_geography, validate_geography
from core.normalization.inflation import adjust_for_inflation
Connectors
# In government-data MCP
from core.connectors.census.bds_client import BDSClient
from core.connectors.bea.bea_client import BEAClient
Market Structure Logic
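# In market-structure MCP
# Note: Python module names cannot contain hyphens, so these imports assume the
# market-structure/ directory is exposed as an importable package named market_structure.
from market_structure.logic.joins import join_abs_bea_data
from market_structure.logic.fragmentation import calculate_hhi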
Testing Structure
Unit Tests
- core/schemas/tests/ - Test normalization, validation
- core/normalization/tests/ - Test inflation, mappings
- market-structure/tests/ - Test each tool independently
Integration Tests
- market-structure/tests/integration/ - Test full tool flows
- Use fixtures from tests/fixtures/sample_responses.py
Test Fixtures
- Mock API responses in tests/fixtures/
- Use pytest fixtures for reusable test data
Documentation Structure
docs/
├── architecture.md # (CREATED) Overall architecture
├── contracts/
│ └── market_structure.md # (CREATED) Tool contracts
├── methodology/
│ ├── tam_sam_som.md # TAM/SAM/SOM definitions
│ ├── fragmentation.md # HHI methodology
│ └── reachability.md # Reachability methodology
└── skeleton_proposal.md # (THIS FILE)
Next Steps
- Review this proposal - Validate structure makes sense
- Create core/ directory - Start with schemas and connectors
- Migrate connectors - Move from government-data/ to core/connectors/
- Refactor government-data - Add tools/ and adapters/
- Enhance market-structure - Add tools/ and logic/ separation
Questions to Resolve
- Package structure: Should core/ be a Python package (pip install core), or relative imports?
  - Recommendation: Start with relative imports, move to package later if needed
- Connector extraction: Should connectors be extracted immediately, or kept in government-data/ for now?
  - Recommendation: Extract gradually (Phase 1 migration path)
- Shared utilities: Should shared/ be merged into core/?
  - Recommendation: Keep shared/ for now, merge later if it makes sense
- Testing framework: Use pytest or unittest?
  - Recommendation: pytest (more modern, better fixtures)