Compare commits

8 Commits

8 changed files with 176 additions and 33 deletions

View File

@@ -1,7 +1,7 @@
[book] [book]
title = "RustFrame User Guide" title = "Rustframe User Guide"
author = ["RustFrame Contributors"] authors = ["Palash Tyagi (https://github.com/Magnus167)"]
description = "Guided journey through RustFrame capabilities." description = "Guided journey through Rustframe capabilities."
[build] [build]
build-dir = "book" build-dir = "book"

View File

@@ -1,5 +1,5 @@
#!/usr/bin/env sh #!/usr/bin/env sh
# Build and test the RustFrame user guide using mdBook. # Build and test the Rustframe user guide using mdBook.
set -e set -e
cd docs cd docs

View File

@@ -11,4 +11,4 @@ mdbook test -L ../target/debug/deps "$@"
mdbook build "$@" mdbook build "$@"
cargo build cargo build
cargo build --release # cargo build --release

View File

@@ -1,30 +1,54 @@
# Compute Features # Compute Features
The `compute` module provides statistical routines like descriptive The `compute` module hosts numerical routines for exploratory data analysis.
statistics and correlation measures. It covers descriptive statistics, correlations, probability distributions and
some basic inferential tests.
## Basic Statistics ## Basic Statistics
```rust ```rust
# extern crate rustframe; # extern crate rustframe;
use rustframe::compute::stats::{mean, stddev}; use rustframe::compute::stats::{mean, mean_vertical, stddev, median};
use rustframe::matrix::Matrix; use rustframe::matrix::Matrix;
let m = Matrix::from_vec(vec![1.0, 2.0, 3.0, 4.0], 2, 2); let m = Matrix::from_vec(vec![1.0, 2.0, 3.0, 4.0], 2, 2);
let mean_val = mean(&m); assert_eq!(mean(&m), 2.5);
let std_val = stddev(&m); assert_eq!(stddev(&m), 1.118033988749895);
assert_eq!(median(&m), 2.5);
// column averages returned as 1 x n matrix
let col_means = mean_vertical(&m);
assert_eq!(col_means.data(), &[1.5, 3.5]);
``` ```
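The values asserted in this snippet can be checked by hand. Below is a minimal standalone sketch in plain Rust, not Rustframe's implementation: the helper names are hypothetical, and it assumes `stddev` is the population standard deviation (dividing by n), which is what the asserted value of roughly 1.118 corresponds to for this input.

```rust
// Hypothetical helpers for illustration; not part of Rustframe.

/// Arithmetic mean of a slice.
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Population standard deviation (divides by n, not n - 1).
fn stddev_population(xs: &[f64]) -> f64 {
    let m = mean(xs);
    let var = xs.iter().map(|x| (x - m).powi(2)).sum::<f64>() / xs.len() as f64;
    var.sqrt()
}

/// Median: the middle value, or the average of the two middle values.
fn median(xs: &[f64]) -> f64 {
    let mut sorted = xs.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).expect("no NaN values"));
    let n = sorted.len();
    if n % 2 == 1 {
        sorted[n / 2]
    } else {
        (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0
    }
}

fn main() {
    let xs = [1.0, 2.0, 3.0, 4.0];
    assert_eq!(mean(&xs), 2.5);
    assert_eq!(median(&xs), 2.5);
    // Variance is (2.25 + 0.25 + 0.25 + 2.25) / 4 = 1.25, so stddev = sqrt(1.25).
    assert!((stddev_population(&xs) - 1.25f64.sqrt()).abs() < 1e-12);
}
```

A sample standard deviation (dividing by n - 1) would give about 1.291 for the same input, so the asserted value is a quick way to tell the two conventions apart.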
## Correlation ## Correlation
Correlation functions help measure linear relationships between datasets.
```rust ```rust
# extern crate rustframe; # extern crate rustframe;
use rustframe::compute::stats::pearson; use rustframe::compute::stats::{pearson, covariance};
use rustframe::matrix::Matrix; use rustframe::matrix::Matrix;
let x = Matrix::from_vec(vec![1.0, 2.0, 3.0, 4.0], 2, 2); let x = Matrix::from_vec(vec![1.0, 2.0, 3.0, 4.0], 2, 2);
let y = Matrix::from_vec(vec![2.0, 4.0, 6.0, 8.0], 2, 2); let y = Matrix::from_vec(vec![2.0, 4.0, 6.0, 8.0], 2, 2);
let corr = pearson(&x, &y); let corr = pearson(&x, &y);
let cov = covariance(&x, &y);
assert!((corr - 1.0).abs() < 1e-8);
assert!((cov - 2.5).abs() < 1e-8);
```
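The two tolerances asserted here follow from the definitions. The sketch below is a plain-Rust illustration with hypothetical helper names, not Rustframe's code. It assumes population covariance (dividing by n), which is consistent with the asserted value of 2.5 for these inputs.

```rust
// Hypothetical helpers for illustration; not part of Rustframe.

fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Population covariance: the mean of the products of deviations.
fn covariance(xs: &[f64], ys: &[f64]) -> f64 {
    assert_eq!(xs.len(), ys.len(), "inputs must have equal length");
    let (mx, my) = (mean(xs), mean(ys));
    xs.iter()
        .zip(ys)
        .map(|(x, y)| (x - mx) * (y - my))
        .sum::<f64>()
        / xs.len() as f64
}

/// Pearson correlation: covariance scaled by both standard deviations.
fn pearson(xs: &[f64], ys: &[f64]) -> f64 {
    covariance(xs, ys) / (covariance(xs, xs) * covariance(ys, ys)).sqrt()
}

fn main() {
    let x = [1.0, 2.0, 3.0, 4.0];
    let y = [2.0, 4.0, 6.0, 8.0];
    // y = 2x, so the deviations of y are twice those of x:
    // cov = 2 * var(x) = 2 * 1.25 = 2.5, and the correlation is exactly linear.
    assert!((covariance(&x, &y) - 2.5).abs() < 1e-8);
    assert!((pearson(&x, &y) - 1.0).abs() < 1e-8);
}
```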
## Distributions
Probability distribution helpers are available for common PDFs and CDFs.
```rust
# extern crate rustframe;
use rustframe::compute::stats::distributions::normal_pdf;
use rustframe::matrix::Matrix;
let x = Matrix::from_vec(vec![0.0, 1.0], 1, 2);
let pdf = normal_pdf(x, 0.0, 1.0);
assert_eq!(pdf.data().len(), 2);
``` ```
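The snippet only checks the length of the result. For reference, the standard normal density formula gives the values below. This is a standalone sketch with a hypothetical scalar helper, and it assumes the second and third arguments of `normal_pdf` are the mean and the standard deviation, which the diff does not state.

```rust
use std::f64::consts::PI;

/// Hypothetical scalar helper, not Rustframe's `normal_pdf`:
/// density of a normal distribution with the given mean and standard deviation.
fn normal_pdf_scalar(x: f64, mean: f64, sd: f64) -> f64 {
    let z = (x - mean) / sd;
    (-0.5 * z * z).exp() / (sd * (2.0 * PI).sqrt())
}

fn main() {
    let xs = [0.0, 1.0];
    let pdf: Vec<f64> = xs.iter().map(|&x| normal_pdf_scalar(x, 0.0, 1.0)).collect();
    // The standard normal peaks at 1 / sqrt(2 * pi), roughly 0.3989.
    assert!((pdf[0] - 0.3989422804014327).abs() < 1e-12);
    // One standard deviation away the density drops by a factor of exp(-0.5).
    assert!((pdf[1] - 0.24197072451914337).abs() < 1e-12);
}
```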
With the basics covered, explore predictive models in the With the basics covered, explore predictive models in the

View File

@@ -1,7 +1,8 @@
# Data Manipulation # Data Manipulation
RustFrame's `Frame` type couples tabular data with Rustframe's `Frame` type couples tabular data with
column labels and a typed row index. column labels and a typed row index. Frames expose a familiar API for loading
data, selecting rows or columns and performing aggregations.
## Creating a Frame ## Creating a Frame
@@ -17,27 +18,60 @@ assert_eq!(frame["A"], vec![1.0, 2.0]);
## Indexing Rows ## Indexing Rows
Row labels can be integers, dates or a default range. Retrieving a row returns a
view that lets you inspect values by column name or position.
```rust
# extern crate rustframe;
# extern crate chrono;
use chrono::NaiveDate;
use rustframe::frame::{Frame, RowIndex};
use rustframe::matrix::Matrix;
let d = |y, m, d| NaiveDate::from_ymd_opt(y, m, d).unwrap();
let data = Matrix::from_cols(vec![vec![1.0, 2.0], vec![3.0, 4.0]]);
let index = RowIndex::Date(vec![d(2024, 1, 1), d(2024, 1, 2)]);
let mut frame = Frame::new(data, vec!["A", "B"], Some(index));
assert_eq!(frame.get_row_date(d(2024, 1, 2))["B"], 4.0);
// mutate by row key
frame.get_row_date_mut(d(2024, 1, 1)).set_by_index(0, 9.0);
assert_eq!(frame.get_row_date(d(2024, 1, 1))["A"], 9.0);
```
## Column operations
Columns can be inserted, renamed, removed or reordered in place.
```rust ```rust
# extern crate rustframe; # extern crate rustframe;
use rustframe::frame::{Frame, RowIndex}; use rustframe::frame::{Frame, RowIndex};
use rustframe::matrix::Matrix; use rustframe::matrix::Matrix;
let data = Matrix::from_cols(vec![vec![1.0, 2.0], vec![3.0, 4.0]]); let data = Matrix::from_cols(vec![vec![1, 2], vec![3, 4]]);
let index = RowIndex::Int(vec![10, 20]); let mut frame = Frame::new(data, vec!["X", "Y"], Some(RowIndex::Range(0..2)));
let frame = Frame::new(data, vec!["A", "B"], Some(index));
assert_eq!(frame.get_row(20)["B"], 4.0); frame.add_column("Z", vec![5, 6]);
frame.rename("Y", "W");
let removed = frame.delete_column("X");
assert_eq!(removed, vec![1, 2]);
frame.sort_columns();
assert_eq!(frame.columns(), &["W", "Z"]);
``` ```
## Aggregations ## Aggregations
Any numeric aggregation available on `Matrix` is forwarded to `Frame`.
```rust ```rust
# extern crate rustframe; # extern crate rustframe;
use rustframe::frame::Frame; use rustframe::frame::Frame;
use rustframe::matrix::{Matrix, SeriesOps}; use rustframe::matrix::{Matrix, SeriesOps};
let frame = Frame::new(Matrix::from_cols(vec![vec![1.0, 2.0]]), vec!["A"], None); let frame = Frame::new(Matrix::from_cols(vec![vec![1.0, 2.0], vec![3.0, 4.0]]), vec!["A", "B"], None);
assert_eq!(frame.sum_vertical(), vec![3.0]); assert_eq!(frame.sum_vertical(), vec![3.0, 7.0]);
assert_eq!(frame.sum_horizontal(), vec![4.0, 6.0]);
``` ```
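To see where `[3.0, 7.0]` and `[4.0, 6.0]` come from, the sketch below reproduces both aggregations on a flat column-major buffer. It is a plain-Rust illustration with hypothetical free functions, not Rustframe's `Matrix` or `SeriesOps`. It assumes only that `from_cols` stores each column contiguously.

```rust
// Hypothetical column-major helpers for illustration; not Rustframe's `Matrix` type.

/// Sum down each column: one value per column.
/// `rows` must be nonzero because `chunks` panics on a zero chunk size.
fn sum_vertical(data: &[f64], rows: usize) -> Vec<f64> {
    data.chunks(rows).map(|col| col.iter().sum::<f64>()).collect()
}

/// Sum across each row: one value per row.
fn sum_horizontal(data: &[f64], rows: usize) -> Vec<f64> {
    let mut out = vec![0.0; rows];
    for col in data.chunks(rows) {
        for (acc, v) in out.iter_mut().zip(col) {
            *acc += v;
        }
    }
    out
}

fn main() {
    // Columns A = [1, 2] and B = [3, 4], stored one after the other.
    let data = [1.0, 2.0, 3.0, 4.0];
    assert_eq!(sum_vertical(&data, 2), vec![3.0, 7.0]);
    assert_eq!(sum_horizontal(&data, 2), vec![4.0, 6.0]);
}
```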
When you're ready to run analytics, continue to the With the basics covered, continue to the [compute features](./compute.md)
[compute features](./compute.md) chapter. chapter for statistics and analytics.

View File

@@ -1,10 +1,38 @@
# Introduction # Introduction
Welcome to the **RustFrame User Guide**. This book provides a tour of Welcome to the **Rustframe User Guide**. Rustframe is a lightweight dataframe
RustFrame's capabilities from basic data handling to advanced machine learning and math toolkit for Rust written in 100% safe Rust. It focuses on keeping the
workflows. Each chapter contains runnable snippets so you can follow along. API approachable while offering handy features for small analytical or
educational projects.
1. [Data manipulation](./data-manipulation.md) for loading and transforming data. Rustframe bundles:
2. [Compute features](./compute.md) for statistics and analytics.
3. [Machine learning](./machine-learning.md) for predictive models. - column-labelled frames built on a fast column-major matrix
4. [Utilities](./utilities.md) for supporting helpers and upcoming modules. - familiar element-wise math and aggregation routines
- a growing `compute` module for statistics and machine learning
- utilities for dates and random numbers
```rust
# extern crate rustframe;
use rustframe::{frame::Frame, matrix::{Matrix, SeriesOps}};
let data = Matrix::from_cols(vec![vec![1.0, 2.0], vec![3.0, 4.0]]);
let frame = Frame::new(data, vec!["A", "B"], None);
// Perform column-wise aggregation
assert_eq!(frame.sum_vertical(), vec![3.0, 7.0]);
```
## Resources
- [GitHub repository](https://github.com/Magnus167/rustframe)
- [Crates.io](https://crates.io/crates/rustframe) & [API docs](https://docs.rs/rustframe)
- [Code coverage](https://codecov.io/gh/Magnus167/rustframe)
This guide walks through the main building blocks of the library. Each chapter
contains runnable snippets so you can follow along:
1. [Data manipulation](./data-manipulation.md) for loading and transforming data
2. [Compute features](./compute.md) for statistics and analytics
3. [Machine learning](./machine-learning.md) for predictive models
4. [Utilities](./utilities.md) for supporting helpers and upcoming modules

View File

@@ -1,11 +1,17 @@
# Machine Learning # Machine Learning
RustFrame ships with several algorithms: The `compute::models` module bundles several learning algorithms that operate on
`Matrix` structures. These examples highlight the basic training and prediction
APIs. For more end-to-end walkthroughs see the examples directory in the
repository.
Currently implemented models include:
- Linear and logistic regression - Linear and logistic regression
- K-means clustering - K-means clustering
- Principal component analysis (PCA) - Principal component analysis (PCA)
- Naive Bayes and dense neural networks - Gaussian Naive Bayes
- Dense neural networks
## Linear Regression ## Linear Regression
@@ -37,3 +43,34 @@ let cluster = model.predict(&new_point)[0];
For helper functions and upcoming modules, visit the For helper functions and upcoming modules, visit the
[utilities](./utilities.md) section. [utilities](./utilities.md) section.
## Logistic Regression
```rust
# extern crate rustframe;
use rustframe::compute::models::logreg::LogReg;
use rustframe::matrix::Matrix;
let x = Matrix::from_vec(vec![1.0, 2.0, 3.0, 4.0], 4, 1);
let y = Matrix::from_vec(vec![0.0, 0.0, 1.0, 1.0], 4, 1);
let mut model = LogReg::new(1);
model.fit(&x, &y, 0.1, 200);
let preds = model.predict_proba(&x);
assert_eq!(preds.rows(), 4);
```
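The snippet only asserts the output shape. As a rough picture of what such a model does, here is a single-feature logistic regression trained with batch gradient descent in plain Rust. It is a sketch, not Rustframe's `LogReg`: every name is hypothetical, and reading `0.1` and `200` as a learning rate and an iteration count is an assumption the diff does not confirm.

```rust
// Hypothetical single-feature logistic regression; not Rustframe's `LogReg`.

fn sigmoid(z: f64) -> f64 {
    1.0 / (1.0 + (-z).exp())
}

/// Fit a weight and a bias with batch gradient descent on the log-loss.
fn fit(xs: &[f64], ys: &[f64], lr: f64, iters: usize) -> (f64, f64) {
    let n = xs.len() as f64;
    let (mut w, mut b) = (0.0_f64, 0.0_f64);
    for _ in 0..iters {
        let (mut grad_w, mut grad_b) = (0.0_f64, 0.0_f64);
        for (x, y) in xs.iter().zip(ys) {
            // Gradient of the log-loss with respect to the linear output.
            let err = sigmoid(w * x + b) - y;
            grad_w += err * x;
            grad_b += err;
        }
        w -= lr * grad_w / n;
        b -= lr * grad_b / n;
    }
    (w, b)
}

/// Predicted probability of class 1 for each input.
fn predict_proba(xs: &[f64], w: f64, b: f64) -> Vec<f64> {
    xs.iter().map(|x| sigmoid(w * x + b)).collect()
}

fn main() {
    let xs = [1.0, 2.0, 3.0, 4.0];
    let ys = [0.0, 0.0, 1.0, 1.0];
    let (w, b) = fit(&xs, &ys, 0.1, 200);
    let probs = predict_proba(&xs, w, b);
    assert_eq!(probs.len(), 4);
    // Larger inputs belong to class 1, so the learned weight is positive
    // and the predicted probability rises with x.
    assert!(w > 0.0);
    assert!(probs[3] > probs[0]);
}
```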
## Principal Component Analysis
```rust
# extern crate rustframe;
use rustframe::compute::models::pca::PCA;
use rustframe::matrix::Matrix;
let data = Matrix::from_vec(vec![1.0, 2.0, 3.0, 4.0], 2, 2);
let pca = PCA::fit(&data, 1, 0);
let transformed = pca.transform(&data);
assert_eq!(transformed.cols(), 1);
```
For helper functions and upcoming modules, visit the
[utilities](./utilities.md) section.

View File

@@ -3,16 +3,36 @@
Utilities provide handy helpers around the core library. Existing tools Utilities provide handy helpers around the core library. Existing tools
include: include:
- Date utilities for generating calendar sequences. - Date utilities for generating calendar sequences and business-day sets
- Random number generators for simulations and testing
## Date Helpers ## Date Helpers
```rust ```rust
# extern crate rustframe; # extern crate rustframe;
use rustframe::utils::dateutils::{DatesList, DateFreq}; use rustframe::utils::dateutils::{BDatesList, BDateFreq, DatesList, DateFreq};
// Calendar sequence
let list = DatesList::new("2024-01-01".into(), "2024-01-03".into(), DateFreq::Daily); let list = DatesList::new("2024-01-01".into(), "2024-01-03".into(), DateFreq::Daily);
assert_eq!(list.count().unwrap(), 3); assert_eq!(list.count().unwrap(), 3);
// Business days starting from 2024-01-02
let bdates = BDatesList::from_n_periods("2024-01-02".into(), BDateFreq::Daily, 3).unwrap();
assert_eq!(bdates.list().unwrap().len(), 3);
```
## Random Numbers
The `random` module offers deterministic and cryptographically secure RNGs.
```rust
# extern crate rustframe;
use rustframe::random::{Prng, Rng};
let mut rng = Prng::new(42);
let v1 = rng.next_u64();
let v2 = rng.next_u64();
assert_ne!(v1, v2);
``` ```
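"Deterministic" here means that the same seed always reproduces the same sequence. The sketch below illustrates that property with a xorshift64 generator in plain Rust. The algorithm and the type name are stand-ins chosen for brevity; the diff does not say which algorithm `Prng` uses, and xorshift64 is not cryptographically secure.

```rust
// Hypothetical xorshift64 generator for illustration; not Rustframe's `Prng`.
struct XorShift64 {
    state: u64,
}

impl XorShift64 {
    /// The state must be nonzero, so a zero seed is replaced with a fixed constant.
    fn new(seed: u64) -> Self {
        let state = if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed };
        Self { state }
    }

    /// Advance the state with three xor-shift steps and return it.
    fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.state = x;
        x
    }
}

fn main() {
    let mut a = XorShift64::new(42);
    let mut b = XorShift64::new(42);
    // Same seed, same sequence.
    let first = a.next_u64();
    assert_eq!(first, b.next_u64());
    // Consecutive outputs differ.
    assert_ne!(first, a.next_u64());
}
```

Seeding explicitly like this is what makes simulations and tests repeatable; a generator seeded from the operating system would give different values on each run.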
Upcoming utilities will cover: Upcoming utilities will cover: