Compare commits

8 Commits

8 changed files with 176 additions and 33 deletions

View File

@ -1,7 +1,7 @@
[book]
title = "RustFrame User Guide"
author = ["RustFrame Contributors"]
description = "Guided journey through RustFrame capabilities."
title = "Rustframe User Guide"
authors = ["Palash Tyagi (https://github.com/Magnus167)"]
description = "Guided journey through Rustframe capabilities."
[build]
build-dir = "book"

View File

@ -1,5 +1,5 @@
#!/usr/bin/env sh
# Build and test the RustFrame user guide using mdBook.
# Build and test the Rustframe user guide using mdBook.
set -e
cd docs

View File

@ -11,4 +11,4 @@ mdbook test -L ../target/debug/deps "$@"
mdbook build "$@"
cargo build
cargo build --release
# cargo build --release

View File

@ -1,30 +1,54 @@
# Compute Features
The `compute` module provides statistical routines like descriptive
statistics and correlation measures.
The `compute` module hosts numerical routines for exploratory data analysis.
It covers descriptive statistics, correlations, probability distributions and
some basic inferential tests.
## Basic Statistics
```rust
# extern crate rustframe;
use rustframe::compute::stats::{mean, stddev};
use rustframe::compute::stats::{mean, mean_vertical, stddev, median};
use rustframe::matrix::Matrix;
let m = Matrix::from_vec(vec![1.0, 2.0, 3.0, 4.0], 2, 2);
let mean_val = mean(&m);
let std_val = stddev(&m);
assert_eq!(mean(&m), 2.5);
assert_eq!(stddev(&m), 1.118033988749895);
assert_eq!(median(&m), 2.5);
// column averages returned as 1 x n matrix
let col_means = mean_vertical(&m);
assert_eq!(col_means.data(), &[1.5, 3.5]);
```
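For readers who want to see where the asserted values come from, here is a minimal sketch of the three statistics in plain Rust, operating on slices rather than on Rustframe's `Matrix`. The function bodies are illustrative assumptions and not the library's implementation. The value `1.118033988749895` asserted above equals `sqrt(1.25)`, which is what the population formula (dividing by `n`, not `n - 1`) gives for `1.0, 2.0, 3.0, 4.0`.

```rust
/// Arithmetic mean of a non-empty slice.
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Population standard deviation of a non-empty slice: divides by n, not n - 1.
fn population_stddev(xs: &[f64]) -> f64 {
    let mu = mean(xs);
    let var = xs.iter().map(|x| (x - mu).powi(2)).sum::<f64>() / xs.len() as f64;
    var.sqrt()
}

/// Median of a non-empty slice: the middle value after sorting,
/// or the average of the two middle values for an even length.
fn median(xs: &[f64]) -> f64 {
    let mut sorted = xs.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    }
}
```

Calling these helpers on `[1.0, 2.0, 3.0, 4.0]` gives a mean and median of `2.5`, matching the assertions in the snippet above.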
## Correlation
Correlation functions help measure linear relationships between datasets.
```rust
# extern crate rustframe;
use rustframe::compute::stats::pearson;
use rustframe::compute::stats::{pearson, covariance};
use rustframe::matrix::Matrix;
let x = Matrix::from_vec(vec![1.0, 2.0, 3.0, 4.0], 2, 2);
let y = Matrix::from_vec(vec![2.0, 4.0, 6.0, 8.0], 2, 2);
let corr = pearson(&x, &y);
let cov = covariance(&x, &y);
assert!((corr - 1.0).abs() < 1e-8);
assert!((cov - 2.5).abs() < 1e-8);
```
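The two asserted values can be checked by hand with a small sketch in plain Rust. It assumes population covariance (dividing by `n`), which is consistent with the `2.5` asserted above. The helpers work on slices and are hypothetical stand-ins, not Rustframe's own code.

```rust
/// Arithmetic mean of a non-empty slice.
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Population covariance of two equal-length, non-empty slices.
fn covariance(xs: &[f64], ys: &[f64]) -> f64 {
    assert_eq!(xs.len(), ys.len(), "inputs must have equal length");
    let (mx, my) = (mean(xs), mean(ys));
    xs.iter()
        .zip(ys)
        .map(|(x, y)| (x - mx) * (y - my))
        .sum::<f64>()
        / xs.len() as f64
}

/// Pearson correlation: covariance divided by the product of both
/// standard deviations (the variance of x is covariance(x, x)).
fn pearson(xs: &[f64], ys: &[f64]) -> f64 {
    covariance(xs, ys) / (covariance(xs, xs).sqrt() * covariance(ys, ys).sqrt())
}
```

Because `y` is exactly `2 * x` in the example, the covariance is twice the variance of `x` (2 × 1.25 = 2.5) and the correlation is 1.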
## Distributions
Probability distribution helpers are available for common PDFs and CDFs.
```rust
# extern crate rustframe;
use rustframe::compute::stats::distributions::normal_pdf;
use rustframe::matrix::Matrix;
let x = Matrix::from_vec(vec![0.0, 1.0], 1, 2);
let pdf = normal_pdf(x, 0.0, 1.0);
assert_eq!(pdf.data().len(), 2);
```
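As a point of reference, the normal density itself is short enough to write out. The sketch below evaluates it for a single scalar in plain Rust. It assumes the two numeric arguments of `normal_pdf` above are the mean and the standard deviation, in that order, which the snippet does not state explicitly.

```rust
use std::f64::consts::PI;

/// Density of a normal distribution at `x`, given its mean and
/// standard deviation (`sd` must be positive).
fn normal_pdf_scalar(x: f64, mean: f64, sd: f64) -> f64 {
    let z = (x - mean) / sd;
    (-0.5 * z * z).exp() / (sd * (2.0 * PI).sqrt())
}
```

For the standard normal, the density at `0.0` is `1 / sqrt(2π)`, roughly 0.3989, and the density at `1.0` is roughly 0.2420.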
With the basics covered, explore predictive models in the

View File

@ -1,7 +1,8 @@
# Data Manipulation
RustFrame's `Frame` type couples tabular data with
column labels and a typed row index.
Rustframe's `Frame` type couples tabular data with
column labels and a typed row index. Frames expose a familiar API for loading
data, selecting rows or columns and performing aggregations.
## Creating a Frame
@ -17,27 +18,60 @@ assert_eq!(frame["A"], vec![1.0, 2.0]);
## Indexing Rows
Row labels can be integers, dates or a default range. Retrieving a row returns a
view that lets you inspect values by column name or position.
```rust
# extern crate rustframe;
# extern crate chrono;
use chrono::NaiveDate;
use rustframe::frame::{Frame, RowIndex};
use rustframe::matrix::Matrix;
let d = |y, m, d| NaiveDate::from_ymd_opt(y, m, d).unwrap();
let data = Matrix::from_cols(vec![vec![1.0, 2.0], vec![3.0, 4.0]]);
let index = RowIndex::Date(vec![d(2024, 1, 1), d(2024, 1, 2)]);
let mut frame = Frame::new(data, vec!["A", "B"], Some(index));
assert_eq!(frame.get_row_date(d(2024, 1, 2))["B"], 4.0);
// mutate by row key
frame.get_row_date_mut(d(2024, 1, 1)).set_by_index(0, 9.0);
assert_eq!(frame.get_row_date(d(2024, 1, 1))["A"], 9.0);
```
## Column operations
Columns can be inserted, renamed, removed or reordered in place.
```rust
# extern crate rustframe;
use rustframe::frame::{Frame, RowIndex};
use rustframe::matrix::Matrix;
let data = Matrix::from_cols(vec![vec![1.0, 2.0], vec![3.0, 4.0]]);
let index = RowIndex::Int(vec![10, 20]);
let frame = Frame::new(data, vec!["A", "B"], Some(index));
assert_eq!(frame.get_row(20)["B"], 4.0);
let data = Matrix::from_cols(vec![vec![1, 2], vec![3, 4]]);
let mut frame = Frame::new(data, vec!["X", "Y"], Some(RowIndex::Range(0..2)));
frame.add_column("Z", vec![5, 6]);
frame.rename("Y", "W");
let removed = frame.delete_column("X");
assert_eq!(removed, vec![1, 2]);
frame.sort_columns();
assert_eq!(frame.columns(), &["W", "Z"]);
```
## Aggregations
Any numeric aggregation available on `Matrix` is forwarded to `Frame`.
```rust
# extern crate rustframe;
use rustframe::frame::Frame;
use rustframe::matrix::{Matrix, SeriesOps};
let frame = Frame::new(Matrix::from_cols(vec![vec![1.0, 2.0]]), vec!["A"], None);
assert_eq!(frame.sum_vertical(), vec![3.0]);
let frame = Frame::new(Matrix::from_cols(vec![vec![1.0, 2.0], vec![3.0, 4.0]]), vec!["A", "B"], None);
assert_eq!(frame.sum_vertical(), vec![3.0, 7.0]);
assert_eq!(frame.sum_horizontal(), vec![4.0, 6.0]);
```
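To make the two directions concrete, the following sketch reproduces the sums on a plain list of columns. The helper names are hypothetical and the data layout (one `Vec<f64>` per column) is an assumption chosen to mirror `Matrix::from_cols`; it is not how Rustframe stores data internally.

```rust
/// Sum down each column: one result per column.
fn sum_vertical(cols: &[Vec<f64>]) -> Vec<f64> {
    cols.iter().map(|c| c.iter().sum::<f64>()).collect()
}

/// Sum across each row: one result per row.
/// All columns are expected to have the same length.
fn sum_horizontal(cols: &[Vec<f64>]) -> Vec<f64> {
    let rows = cols.first().map_or(0, |c| c.len());
    (0..rows)
        .map(|r| cols.iter().map(|c| c[r]).sum::<f64>())
        .collect()
}
```

With columns `[1.0, 2.0]` and `[3.0, 4.0]`, the vertical sums are `[3.0, 7.0]` and the horizontal sums are `[4.0, 6.0]`, matching the assertions above.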
When you're ready to run analytics, continue to the
[compute features](./compute.md) chapter.
With the basics covered, continue to the [compute features](./compute.md)
chapter for statistics and analytics.

View File

@ -1,10 +1,38 @@
# Introduction
Welcome to the **RustFrame User Guide**. This book provides a tour of
RustFrame's capabilities from basic data handling to advanced machine learning
workflows. Each chapter contains runnable snippets so you can follow along.
Welcome to the **Rustframe User Guide**. Rustframe is a lightweight dataframe
and math toolkit for Rust written in 100% safe Rust. It focuses on keeping the
API approachable while offering handy features for small analytical or
educational projects.
1. [Data manipulation](./data-manipulation.md) for loading and transforming data.
2. [Compute features](./compute.md) for statistics and analytics.
3. [Machine learning](./machine-learning.md) for predictive models.
4. [Utilities](./utilities.md) for supporting helpers and upcoming modules.
Rustframe bundles:
- column-labelled frames built on a fast column-major matrix
- familiar elementwise math and aggregation routines
- a growing `compute` module for statistics and machine learning
- utilities for dates and random numbers
```rust
# extern crate rustframe;
use rustframe::{frame::Frame, matrix::{Matrix, SeriesOps}};
let data = Matrix::from_cols(vec![vec![1.0, 2.0], vec![3.0, 4.0]]);
let frame = Frame::new(data, vec!["A", "B"], None);
// Perform column-wise aggregation
assert_eq!(frame.sum_vertical(), vec![3.0, 7.0]);
```
## Resources
- [GitHub repository](https://github.com/Magnus167/rustframe)
- [Crates.io](https://crates.io/crates/rustframe) & [API docs](https://docs.rs/rustframe)
- [Code coverage](https://codecov.io/gh/Magnus167/rustframe)
This guide walks through the main building blocks of the library. Each chapter
contains runnable snippets so you can follow along:
1. [Data manipulation](./data-manipulation.md) for loading and transforming data
2. [Compute features](./compute.md) for statistics and analytics
3. [Machine learning](./machine-learning.md) for predictive models
4. [Utilities](./utilities.md) for supporting helpers and upcoming modules

View File

@ -1,11 +1,17 @@
# Machine Learning
RustFrame ships with several algorithms:
The `compute::models` module bundles several learning algorithms that operate on
`Matrix` structures. These examples highlight the basic training and prediction
APIs. For more end-to-end walkthroughs see the examples directory in the
repository.
Currently implemented models include:
- Linear and logistic regression
- K-means clustering
- K-means clustering
- Principal component analysis (PCA)
- Naive Bayes and dense neural networks
- Gaussian Naive Bayes
- Dense neural networks
## Linear Regression
@ -37,3 +43,34 @@ let cluster = model.predict(&new_point)[0];
For helper functions and upcoming modules, visit the
[utilities](./utilities.md) section.
## Logistic Regression
```rust
# extern crate rustframe;
use rustframe::compute::models::logreg::LogReg;
use rustframe::matrix::Matrix;
let x = Matrix::from_vec(vec![1.0, 2.0, 3.0, 4.0], 4, 1);
let y = Matrix::from_vec(vec![0.0, 0.0, 1.0, 1.0], 4, 1);
let mut model = LogReg::new(1);
model.fit(&x, &y, 0.1, 200);
let preds = model.predict_proba(&x);
assert_eq!(preds.rows(), 4);
```
## Principal Component Analysis
```rust
# extern crate rustframe;
use rustframe::compute::models::pca::PCA;
use rustframe::matrix::Matrix;
let data = Matrix::from_vec(vec![1.0, 2.0, 3.0, 4.0], 2, 2);
let pca = PCA::fit(&data, 1, 0);
let transformed = pca.transform(&data);
assert_eq!(transformed.cols(), 1);
```
For helper functions and upcoming modules, visit the
[utilities](./utilities.md) section.

View File

@ -3,16 +3,36 @@
Utilities provide handy helpers around the core library. Existing tools
include:
- Date utilities for generating calendar sequences.
- Date utilities for generating calendar sequences and business-day sets
- Random number generators for simulations and testing
## Date Helpers
```rust
# extern crate rustframe;
use rustframe::utils::dateutils::{DatesList, DateFreq};
use rustframe::utils::dateutils::{BDatesList, BDateFreq, DatesList, DateFreq};
// Calendar sequence
let list = DatesList::new("2024-01-01".into(), "2024-01-03".into(), DateFreq::Daily);
assert_eq!(list.count().unwrap(), 3);
// Business days starting from 2024-01-02
let bdates = BDatesList::from_n_periods("2024-01-02".into(), BDateFreq::Daily, 3).unwrap();
assert_eq!(bdates.list().unwrap().len(), 3);
```
## Random Numbers
The `random` module offers deterministic and cryptographically secure RNGs.
```rust
# extern crate rustframe;
use rustframe::random::{Prng, Rng};
let mut rng = Prng::new(42);
let v1 = rng.next_u64();
let v2 = rng.next_u64();
assert_ne!(v1, v2);
```
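"Deterministic" here means that a generator seeded with the same value replays the same sequence. The sketch below illustrates that property with SplitMix64, a well-known small generator. It is a stand-in chosen for illustration; the algorithm behind Rustframe's `Prng` is not stated in this guide, and `SplitMix64` is not a Rustframe type.

```rust
/// SplitMix64: a small deterministic generator, used here only to
/// illustrate seeding. Not suitable for cryptographic use.
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn new(seed: u64) -> Self {
        Self { state: seed }
    }

    /// Advance the state by a fixed odd constant, then scramble it.
    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}
```

Two instances created with `SplitMix64::new(42)` yield identical sequences, which is what makes seeded generators convenient for reproducible simulations and tests.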
Upcoming utilities will cover: