---------------------------------------------------------------------------
SVM - Support Vector Machine Classifier for AmigaOS
====================================================
DESCRIPTION
SVM is a command-line binary classifier based on the Support Vector
Machine algorithm. It uses the Sequential Minimal Optimisation (SMO)
training algorithm with a linear kernel. Given a CSV file of labelled
training data it will compute a model and save it to disk. The saved
model can then be used to classify new data instantly.
The code is written in strict C89 and compiled with the Storm C
compiler using IEEE floating point. No third-party libraries are required
beyond the standard mathieeedoubbas.library supplied with AmigaOS.
An example training dataset of Viking artefacts is included
(viking_artefacts.csv), with a binary label indicating whether a given
item predates the Viking Age. This dataset can be used to verify that the
program works correctly before training on your own data.
FEATURES
- Binary classification (two-class problems)
- Linear kernel SVM via the SMO algorithm
- Reads training data from comma-separated (CSV) files
- Saves and loads trained models to/from plain text files
- Classifies a new sample from a comma-separated argument string
- Accepts labels as 0/1 or -1/+1 interchangeably
- Skips comment lines (starting with #) and blank lines in CSV files
- Configurable regularisation (C), tolerance, and convergence parameters
- No external dependencies beyond standard AmigaOS libraries
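As background for the regularisation parameter (C) listed above: in the
standard SMO formulation every multiplier alpha must stay between 0 and C,
and each pair update clips the second multiplier to a range derived from
that box. The sketch below shows only that textbook step. The function and
parameter names (smo_bounds, smo_clip, c_param) are hypothetical, and the
way this program implements the step internally is not documented here.

```c
/* Textbook SMO clipping bounds for the second multiplier of a pair.
   a1, a2  : current multipliers of the two chosen samples
   y1, y2  : their labels in -1/+1 form
   c_param : the regularisation parameter C (upper bound on every alpha)
   All names are illustrative and not taken from the program source. */
void smo_bounds(double a1, double a2, int y1, int y2, double c_param,
                double *lo, double *hi)
{
    if (y1 != y2) {
        /* Opposite labels: a2 - a1 is preserved by the update. */
        *lo = (a2 - a1 > 0.0) ? a2 - a1 : 0.0;
        *hi = (c_param + a2 - a1 < c_param) ? c_param + a2 - a1 : c_param;
    } else {
        /* Same labels: a1 + a2 is preserved by the update. */
        *lo = (a1 + a2 - c_param > 0.0) ? a1 + a2 - c_param : 0.0;
        *hi = (a1 + a2 < c_param) ? a1 + a2 : c_param;
    }
}

/* Clamp a freshly computed multiplier into [lo, hi]. */
double smo_clip(double a, double lo, double hi)
{
    if (a < lo) return lo;
    if (a > hi) return hi;
    return a;
}
```

A larger C widens the box and lets individual samples pull harder on the
decision boundary; a smaller C gives a softer margin.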
REQUIREMENTS
- AmigaOS 2.0 or higher
- 68000 processor or better
- mathieeedoubbas.library (included with AmigaOS)
- Storm C compiler (to recompile from source)
Note: Training is computationally intensive on machines without an FPU.
On a 68000 at 7MHz expect training times of several hours for datasets
of a few hundred samples. Classification of a single sample is fast
regardless of hardware, typically well under one second.
USAGE
Training a model:
svm train <datafile.csv> <modelfile.svm>
Classifying a sample:
svm classify <modelfile.svm> <x1,x2,...,xn>
Examples:
svm train viking_artefacts.csv viking_artefacts.svm
svm classify viking_artefacts.svm 12.0,74.0,16.0,0.0,35.0
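The sample passed to the classify command is a single comma-separated
argument. As a rough illustration of how such a string can be split into
numbers in C89, the sketch below uses the standard strtod function. The
helper name parse_features and its return convention are assumptions made
for this example and are not taken from the program source.

```c
#include <stdlib.h>

/* Hypothetical helper: parse "x1,x2,...,xn" into out[].
   Returns the number of values read, or -1 if the string is malformed
   or holds more than max values. */
int parse_features(const char *s, double *out, int max)
{
    int n = 0;
    char *end;

    for (;;) {
        double v;

        if (n >= max) return -1;          /* too many values */
        v = strtod(s, &end);              /* skips leading whitespace */
        if (end == s) return -1;          /* no number at this position */
        out[n++] = v;

        while (*end == ' ' || *end == '\t') end++;
        if (*end == '\0') return n;       /* clean end of string */
        if (*end != ',') return -1;       /* unexpected character */
        s = end + 1;                      /* continue after the comma */
    }
}
```

With the example above, the string 12.0,74.0,16.0,0.0,35.0 yields five
values, which must match the number of features the model was trained on.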
CSV TRAINING FILE FORMAT
Each line contains one sample. The last column is the class label
(0 or 1, or -1 or +1). All other columns are numeric feature values.
Columns are separated by commas. Whitespace around values is ignored.
Lines beginning with # are treated as comments and skipped.
Example:
silver_pct, copper_pct, lead_pct, burial_depth_cm, weight_g, label
12.0, 74.0, 16.0, 0.0, 35.0, 1
3.5, 87.0, 22.0, 5.2, 98.0, 0
The number of features is detected automatically from the first valid
data row. All subsequent rows must have the same number of features.
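The rules above (comment and blank lines skipped, last column is the
label, both label conventions accepted) can be sketched as a single row
parser. This is an illustration only: the name parse_csv_row, the return
codes, and the choice to map 0 and -1 to the same internal class are
assumptions. How the program treats a non-numeric line without a leading
# is not stated here; the sketch simply reports such a line as malformed
and leaves the decision to the caller.

```c
#include <stdlib.h>

/* Hypothetical sketch: parse one CSV line.
   Returns the feature count (>= 1) for a data row, 0 for a blank or
   comment line, and -1 for a malformed row. On success *label is set
   to +1 (for 1 or +1) or -1 (for 0 or -1). */
int parse_csv_row(const char *line, double *features, int max_features,
                  int *label)
{
    const char *s = line;
    char *end;
    double v, last = 0.0;
    int count = 0;              /* values read so far, label included */

    while (*s == ' ' || *s == '\t') s++;
    if (*s == '\0' || *s == '\n' || *s == '\r' || *s == '#') return 0;

    for (;;) {
        v = strtod(s, &end);
        if (end == s) return -1;            /* field is not a number */
        if (count > 0) {
            /* The previous value turned out to be a feature. */
            if (count > max_features) return -1;
            features[count - 1] = last;
        }
        last = v;
        count++;

        while (*end == ' ' || *end == '\t' ||
               *end == '\r' || *end == '\n') end++;
        if (*end == '\0') break;
        if (*end != ',') return -1;
        s = end + 1;
    }

    if (count < 2) return -1;   /* need one feature plus a label */
    if (last == 1.0) *label = 1;
    else if (last == 0.0 || last == -1.0) *label = -1;
    else return -1;             /* label outside 0/1 and -1/+1 */
    return count - 1;
}
```

A loader built on this would remember the count returned by the first
data row and reject any later row that returns a different count.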
MODEL FILE FORMAT
Model files are plain text and human readable. They record the bias
term, the number of support vectors, and the full feature vector and
alpha coefficient for each support vector. Model files can be copied
between machines freely.
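To show how the stored quantities combine at classification time, the
sketch below evaluates the usual linear-kernel decision function: the
bias plus, for each support vector, its coefficient times the dot product
with the new sample. The struct layout and names are hypothetical and do
not describe the actual file layout. The sketch also ASSUMES that each
stored coefficient already carries the sign of its label (alpha times y),
and that a non-negative score means class 1; neither detail is confirmed
by this readme.

```c
/* Hypothetical in-memory model; the real model file may be laid out
   differently. */
typedef struct {
    int n_sv;             /* number of support vectors */
    int n_features;       /* features per vector */
    double bias;          /* bias term */
    const double *sv;     /* n_sv * n_features values, row by row */
    const double *coef;   /* ASSUMPTION: alpha_i * y_i per vector */
} LinearModel;

/* Decision score: bias + sum over i of coef[i] * dot(sv[i], x). */
double svm_score(const LinearModel *m, const double *x)
{
    double score = m->bias;
    int i, j;

    for (i = 0; i < m->n_sv; i++) {
        double dot = 0.0;
        for (j = 0; j < m->n_features; j++)
            dot += m->sv[i * m->n_features + j] * x[j];
        score += m->coef[i] * dot;
    }
    return score;
}

/* ASSUMPTION: score >= 0 maps to class 1, otherwise class 0. */
int svm_classify(const LinearModel *m, const double *x)
{
    return svm_score(m, x) >= 0.0 ? 1 : 0;
}
```

The cost is one dot product per support vector, which is why classifying
a single sample stays fast even on slow hardware.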
INCLUDED FILES
svm                     Main command-line program
viking_artefacts.csv    Example training dataset (200 samples)
viking_artefacts.svm    Pretrained model from viking_artefacts.csv
svm.readme              This file
LIMITATIONS
- Binary classification only (two classes). Multi-class problems require
multiple models trained in a one-vs-rest arrangement.
- Linear kernel only. Non-linearly separable problems may give poor
accuracy without feature engineering.
- No cross-validation or accuracy reporting built in.
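For the one-vs-rest arrangement mentioned in the first limitation, one
model is trained per class (that class against all others) and the class
whose model gives the highest decision score wins. The sketch below shows
only the final selection step. It ASSUMES a real-valued score is
available from each model; whether the svm command exposes such a score
is not stated in this readme, and the name ovr_pick is hypothetical.

```c
/* Hypothetical helper: given one decision score per one-vs-rest model,
   return the index of the class with the highest score.
   Returns -1 if n_classes is not positive. Ties go to the lowest index. */
int ovr_pick(const double *scores, int n_classes)
{
    int best = 0;
    int k;

    if (n_classes <= 0) return -1;
    for (k = 1; k < n_classes; k++) {
        if (scores[k] > scores[best])
            best = k;
    }
    return best;
}
```

For three classes this means three training runs and three model files,
each trained on a copy of the data with the labels rewritten to 1 for the
class in question and 0 for everything else.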
VERSION HISTORY
1.0 Initial release
DISCLAIMER
This software is provided as-is without warranty of any kind. The author
accepts no responsibility for any loss or damage arising from its use.
Use entirely at your own risk.
---------------------------------------------------------------------------