Installation

Install SPIDB with pip directly from GitHub. The package provides the core database functionality and an optional command-line interface (CLI) for downloading datasets.

Quick Installation

Basic Installation

Install the core package without CLI support:

pip install git+https://github.com/dkadyrov/spidb.git

Installation with CLI Support

To install the core package together with the command-line tools:

pip install "spidb[cli] @ git+https://github.com/dkadyrov/spidb.git"

The [cli] extra also installs the kaggle package, which is required for downloading datasets.

Installation from Source

To develop SPIDB or modify its source, clone the repository and install it from the local checkout:

git clone https://github.com/dkadyrov/spidb.git
cd spidb
pip install -e ".[cli]"

The -e flag installs the package in editable mode, so changes you make to the source files take effect without reinstalling.
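To confirm that an install really is editable, you can inspect the direct_url.json metadata that pip records for installs from a local directory or URL (PEP 610). The sketch below is a minimal check, not part of SPIDB; it assumes the distribution is named spidb, and the function names are hypothetical.

```python
import json
from importlib import metadata


def is_editable(direct_url_text):
    """Interpret the contents of a PEP 610 direct_url.json file."""
    if not direct_url_text:
        # No direct_url.json: a regular install from a package index.
        return False
    info = json.loads(direct_url_text)
    return bool(info.get("dir_info", {}).get("editable", False))


def dist_is_editable(dist_name="spidb"):
    """Return True if the named distribution was installed with pip -e.

    The default name "spidb" is an assumption about the distribution name.
    """
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return False
    # read_text returns None when the file is absent.
    return is_editable(dist.read_text("direct_url.json"))
```

Calling dist_is_editable() after the editable install should return True; after a plain pip install from GitHub it should return False.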

Requirements

Core Dependencies

  • Python >= 3.11

  • SONICDB - Sound Organization and Network Integration for Collection/Collaboration Database

  • SQLAlchemy - Database ORM (via SONICDB)

  • pandas - Data manipulation (via SONICDB)

Optional Dependencies

  • kaggle - For downloading datasets from Kaggle (installed with [cli])

    • Requires Kaggle API credentials
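The requirements above can be checked before installing. The following is a minimal sketch using only the standard library; the import names (sonicdb, sqlalchemy, pandas, kaggle) are assumptions, since a distribution's name and its importable module name can differ.

```python
import importlib.util
import sys

# Assumed import names for the dependencies listed above.
CORE_MODULES = ["sonicdb", "sqlalchemy", "pandas"]
OPTIONAL_MODULES = ["kaggle"]
MIN_PYTHON = (3, 11)


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= tuple(minimum)


def missing_modules(names):
    """Return the top-level module names that cannot be located.

    find_spec only locates a module; it does not import it.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python >= 3.11:", python_ok())
    print("Missing core modules:", missing_modules(CORE_MODULES))
    print("Missing optional modules:", missing_modules(OPTIONAL_MODULES))
```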

Setting Up Kaggle Credentials

To use the spidb download command, you need Kaggle API credentials:

  1. Create a Kaggle account at kaggle.com

  2. Go to your account settings: https://www.kaggle.com/settings/account

  3. Scroll to the “API” section

  4. Click “Create New Token”

  5. Save the downloaded kaggle.json file, then move it into the .kaggle directory in your home folder using the commands for your platform

Windows (PowerShell)

New-Item -ItemType Directory -Force -Path "$env:USERPROFILE\.kaggle"
Move-Item .\kaggle.json "$env:USERPROFILE\.kaggle\"

Linux/Mac

mkdir -p ~/.kaggle
mv kaggle.json ~/.kaggle/
chmod 600 ~/.kaggle/kaggle.json
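Once the file is in place, a quick sanity check can catch the most common mistakes. This sketch assumes the token file is JSON with username and key fields, which is the layout Kaggle has used for kaggle.json; the function name is hypothetical and not part of SPIDB or the kaggle package.

```python
import json
import os
import stat
from pathlib import Path


def check_kaggle_token(path=None):
    """Return a list of problems found with a kaggle.json file (empty if none)."""
    path = Path(path) if path is not None else Path.home() / ".kaggle" / "kaggle.json"
    if not path.is_file():
        return [f"{path} does not exist"]

    try:
        token = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"{path} is not valid JSON: {exc}"]
    if not isinstance(token, dict):
        return [f"{path} does not contain a JSON object"]

    problems = []
    for field in ("username", "key"):
        if not token.get(field):
            problems.append(f"missing or empty field: {field}")

    # Permission bits are only meaningful on POSIX systems (Linux/Mac).
    if os.name == "posix":
        mode = stat.S_IMODE(path.stat().st_mode)
        if mode & 0o077:
            problems.append(f"permissions are {mode:o}; expected 600")
    return problems


if __name__ == "__main__":
    # Prints an empty list when no problems are found.
    print(check_kaggle_token())
```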

Verifying Installation

Check that SPIDB is installed correctly:

# Check version
python -c "import spidb; print(spidb.__version__)"

# Check CLI (if installed with [cli])
spidb --version
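If the package does not expose a version attribute, the installed version can also be read from the package metadata. A minimal sketch, assuming the distribution is named spidb:

```python
from importlib import metadata


def installed_version(dist_name="spidb"):
    """Return the installed version string, or None if it is not installed.

    The default name "spidb" is an assumption about the distribution name.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("spidb version:", installed_version())
```

A result of None means pip has no record of the package in the current environment, which usually points to a different interpreter or virtual environment than the one used for installation.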

Project Structure

spidb/
├── spidb/
│   ├── __init__.py
│   ├── spidb.py         # Core database models
│   ├── cli.py           # Command-line interface
│   └── build.py         # Database building functions
├── docs/                # Documentation
├── examples/            # Example scripts
├── scripts/             # Utility scripts
└── pyproject.toml       # Package configuration

Troubleshooting

Import Errors

If importing spidb fails, force a reinstall of the package and its dependencies:

# Reinstall with dependencies
pip install --force-reinstall "spidb[cli] @ git+https://github.com/dkadyrov/spidb.git"

Kaggle Download Issues

If spidb download fails:

  1. Verify that your credentials are in ~/.kaggle/kaggle.json (Linux/Mac) or $env:USERPROFILE\.kaggle\kaggle.json (Windows)

  2. Check file permissions: chmod 600 ~/.kaggle/kaggle.json (Linux/Mac)

  3. Ensure you’ve accepted the dataset terms on the Kaggle website

  4. Check your internet connection
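When working through the checklist above, it helps to know where the credentials are coming from. The kaggle client can also read credentials from the KAGGLE_USERNAME and KAGGLE_KEY environment variables, and KAGGLE_CONFIG_DIR can point it at a different directory. The sketch below is a simplified approximation of that lookup, not the client's actual logic, and the function name is hypothetical.

```python
import os
from pathlib import Path


def kaggle_credential_source(environ=None, home=None):
    """Report where Kaggle credentials would most likely be read from.

    Returns "environment variables", the path to a kaggle.json file,
    or None if neither is found. Simplified: the real client may also
    check other locations.
    """
    environ = os.environ if environ is None else environ
    home = Path.home() if home is None else Path(home)

    if environ.get("KAGGLE_USERNAME") and environ.get("KAGGLE_KEY"):
        return "environment variables"

    config_dir = Path(environ.get("KAGGLE_CONFIG_DIR") or home / ".kaggle")
    token_file = config_dir / "kaggle.json"
    if token_file.is_file():
        return str(token_file)
    return None


if __name__ == "__main__":
    print("Credentials found in:", kaggle_credential_source())
```

If this prints None, the download command has no credentials to work with, and steps 1 and 2 are the place to start.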

Database Connection Issues

For database errors, first check that a new SQLite database can be created from Python:

# Test basic connectivity
from spidb import Database
db = Database("test.db")  # Creates new SQLite database
print("Database connection successful!")
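Because the database is a SQLite file, it can also be inspected without SPIDB, which helps separate a damaged or missing file from a problem in the package itself. A minimal sketch using the standard-library sqlite3 module; the function name is hypothetical, and the table names you see depend on the SPIDB schema.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_tables(db_path):
    """Return the table names in a SQLite file, sorted alphabetically."""
    # mode=ro opens the file read-only, so a mistyped path raises
    # sqlite3.OperationalError instead of silently creating an empty file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]
```

For example, list_tables("test.db") returns the tables in the file created above. A sqlite3.DatabaseError here suggests the file is not a valid SQLite database.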

Next Steps

  • See Usage for Python API examples

  • See CLI Guide for command-line usage

  • See Database for schema details

  • See Models for data model reference