# Installation

The SPIDB package can be installed using pip directly from GitHub. The package includes both the core database functionality and an optional command-line interface for downloading datasets.

## Quick Installation

### Basic Installation

Install the core package without CLI support:

```bash
pip install git+https://github.com/dkadyrov/spidb.git
```

### Installation with CLI Support

For the complete experience including the command-line tools:

```bash
pip install "git+https://github.com/dkadyrov/spidb.git#egg=spidb[cli]"
```

This installs the additional `kaggle` package required for downloading datasets.

## Installation from Source

For development or to access the latest features:

```bash
git clone https://github.com/dkadyrov/spidb.git
cd spidb
pip install -e ".[cli]"
```

The `-e` flag installs in editable mode, allowing you to modify the source code.

## Requirements

### Core Dependencies

- **Python** >= 3.11
- **SONICDB** - Sound Organization and Network Integration for Collection/Collaboration Database
  - Available at: [github.com/dkadyrov/sonicdb](https://github.com/dkadyrov/sonicdb)
  - Automatically installed as a dependency
- **SQLAlchemy** - Database ORM (via SONICDB)
- **pandas** - Data manipulation (via SONICDB)

### Optional Dependencies

- **kaggle** - For downloading datasets from Kaggle (installed with `[cli]`)
  - Requires Kaggle API credentials

## Setting Up Kaggle Credentials

To use the `spidb download` command, you need Kaggle API credentials:

1. Create a Kaggle account at [kaggle.com](https://www.kaggle.com)
2. Go to your account settings: https://www.kaggle.com/settings/account
3. Scroll to the "API" section
4. Click "Create New Token"
5. Save the downloaded `kaggle.json` file

### Windows

```powershell
mkdir $env:USERPROFILE\.kaggle
Move-Item .\kaggle.json $env:USERPROFILE\.kaggle\
```

### Linux/Mac

```bash
mkdir -p ~/.kaggle
mv kaggle.json ~/.kaggle/
chmod 600 ~/.kaggle/kaggle.json
```

## Verifying Installation

Check that SPIDB is installed correctly:

```bash
# Check version
python -c "import spidb; print(spidb.__version__)"

# Check CLI (if installed with [cli])
spidb --version
```

## Project Structure

```
spidb/
├── spidb/
│   ├── __init__.py
│   ├── spidb.py          # Core database models
│   ├── cli.py            # Command-line interface
│   └── build.py          # Database building functions
├── docs/                 # Documentation
├── examples/             # Example scripts
├── scripts/              # Utility scripts
└── pyproject.toml        # Package configuration
```

## Troubleshooting

### Import Errors

If you encounter import errors:

```bash
# Reinstall with dependencies
pip install --force-reinstall "git+https://github.com/dkadyrov/spidb.git#egg=spidb[cli]"
```

### Kaggle Download Issues

If `spidb download` fails:

1. Verify credentials are in `~/.kaggle/kaggle.json`
2. Check file permissions: `chmod 600 ~/.kaggle/kaggle.json` (Linux/Mac)
3. Ensure you've accepted dataset terms on Kaggle website
4. Check internet connection

### Database Connection Issues

For database errors:

```python
# Test basic connectivity
from spidb import Database

db = Database("test.db")  # Creates new SQLite database
print("Database connection successful!")
```

## Next Steps

- See [Usage](usage.md) for Python API examples
- See [CLI Guide](cli_guide.md) for command-line usage
- See [Database](database.md) for schema details
- See [Models](models.md) for data model reference
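
The credential checks listed under "Kaggle Download Issues" can also be run from Python. The following is a minimal sketch and is not part of SPIDB: the function name `check_kaggle_credentials` is hypothetical, and it assumes the default `~/.kaggle/kaggle.json` location and the `username` and `key` fields that Kaggle writes into the token file.

```python
import json
import os
import stat
from pathlib import Path


def check_kaggle_credentials(path=None):
    """Return a list of problems found; an empty list means the file looks usable.

    Hypothetical helper for illustration only, not a SPIDB or Kaggle API.
    """
    # Default location used in the setup steps: ~/.kaggle/kaggle.json
    path = Path(path) if path is not None else Path.home() / ".kaggle" / "kaggle.json"
    if not path.is_file():
        return [f"{path} does not exist"]

    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError) as exc:
        return [f"{path} could not be read as JSON: {exc}"]
    if not isinstance(data, dict):
        return [f"{path} does not contain a JSON object"]

    problems = []
    # The token file is expected to hold a username and an API key.
    for field in ("username", "key"):
        if not data.get(field):
            problems.append(f"{path} is missing the '{field}' field")

    # On Linux/Mac, flag files readable by group or others (fix: chmod 600).
    if os.name == "posix":
        mode = stat.S_IMODE(path.stat().st_mode)
        if mode & (stat.S_IRWXG | stat.S_IRWXO):
            problems.append(f"{path} has mode {mode:o}; run: chmod 600 {path}")
    return problems


if __name__ == "__main__":
    issues = check_kaggle_credentials()
    for issue in issues:
        print("Problem:", issue)
    if not issues:
        print("Kaggle credentials look OK")
```

The script only inspects the local file. It does not contact Kaggle, so it cannot tell whether the token is still valid or whether the dataset terms have been accepted.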