https://github.com/mtresnik/intel
A machine learning library for kotlin JVM.
https://github.com/mtresnik/intel
apache2 constraint-satisfaction-problem decision-trees gradle k-means kd-tree kotlin library n-queens svm
Last synced: about 2 months ago
JSON representation
A machine learning library for kotlin JVM.
- Host: GitHub
- URL: https://github.com/mtresnik/intel
- Owner: mtresnik
- License: apache-2.0
- Created: 2021-07-26T15:36:58.000Z (almost 5 years ago)
- Default Branch: main
- Last Pushed: 2022-05-15T16:07:26.000Z (about 4 years ago)
- Last Synced: 2023-07-15T23:26:02.419Z (almost 3 years ago)
- Topics: apache2, constraint-satisfaction-problem, decision-trees, gradle, k-means, kd-tree, kotlin, library, n-queens, svm
- Language: Kotlin
- Homepage:
- Size: 839 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# intel
[](https://github.com/mtresnik/intel/actions/workflows/gradle.yml/)
[](https://github.com/mtresnik/intel/packages/1423052)
[](https://github.com/mtresnik/intel/blob/main/LICENSE)
[](https://makeapullrequest.com)
## Dependencies
> Please compile against [com.resnik.math:1.0.0](https://github.com/mtresnik/math/packages/1409888) as well as this project.
## Getting Started
> This is a slightly different process to that of [com.resnik.math](https://github.com/mtresnik/math/).
**~/.m2/settings.xml:**
```xml
...
github
...
github
GITHUB_USERNAME
GITHUB_PAT
```
**pom.xml:**
```xml
github
https://maven.pkg.github.com/mtresnik/math
true
...
com.resnik
math
1.0.0
com.resnik
intel
1.0.0
```
### Gradle (groovy)
**~/.gradle/gradle.properties:**
```groovy
gpr.user=GITHUB_USERNAME
gpr.token=GITHUB_PAT
```
**build.gradle:**
```groovy
repositories {
...
maven {
name="GitHubPackages"
url= uri("https://maven.pkg.github.com/mtresnik/math")
credentials {
// Runner stored in env, else stored in ~/.gradle/gradle.properties
username = System.getenv("USERNAME") ?: findProperty("gpr.user") ?: ""
password = System.getenv("TOKEN") ?: findProperty("gpr.token")
}
}
...
maven {
name="GitHubPackages"
url= uri("https://maven.pkg.github.com/mtresnik/intel")
credentials {
// Runner stored in env, else stored in ~/.gradle/gradle.properties
username = System.getenv("USERNAME") ?: findProperty("gpr.user") ?: ""
password = System.getenv("TOKEN") ?: findProperty("gpr.token")
}
}
}
dependencies {
...
implementation group: 'com.resnik', name: 'math', version: '1.0.0'
implementation group: 'com.resnik', name: 'intel', version: '1.0.0'
...
}
```
## Constraint Satisfaction Problems (CSP's)
Finding solutions to smaller CSP's is simple enough using recursion, but harder for larger / more complex domains. For larger domains we run into the `StackOverflowException` and `Java OEM Exception` since the solutionset is stored at each point on the stack in a recursion model.
**CSPAgent** : A non-recursive Depth-First-Search (DFS) graph explorer with a Stack of parents.
**CSPSingleParentAgent** : A `CSPAgent` with only one parent (used in Async CSP's)
### CSP Types:
* **CSPTree** : Synchronous, linear DFS on the search space using a `CSPAgent`.
* **CSPDomainAsync** : Separates the first variable's domain into |D| `CSPAgent`'s on separate sections of the search space.
* **CSPInferredAsync** : Uses `THREAD_COUNT` `CSPAgent`'s on separate sections of the search space.
* **CSPCoroutine** : Uses kotlin coroutines in JVM Thread Pools.
> If unsure which CSP type to use, just call `CSPFactory`
```kotlin
val variables = listOf("A", "B", "C")
val domain = listOf(1, 2, 3)
val domainMap = variables.associateWith{ domain }
val csp = CSPFactory.createCSP(domainMap)
csp.addConstraint(GlobalAllDiff())
val solutions = csp.findAllSolutions()
val timeTaken = csp.finalTime()
```
### Constraints:
Constraints are used to limit the search space given the initial variables and domains. **Local Constraints** provide *path consistency* while **Global Constraints** provide *absolute consistency*.
**Local Constraints** imply that local consistency tends to global consistency. Used for monotone solutions. Ex: `sumLessThan()`, `localAllDiff()`
**Global Constraints** imply that the entire sequence is needed to know consistency. Ex: `isName()`, `equals()`
**Reusable Constraints** accomplishes both local and global consistency. These can both prune the search space while exploring and be reused as global constraints at the end. Ex: `min()`, `max()`
> **_NOTE:_** High computation constraints should be weighed against search space complexity.
#### Worst Case Complexity (No Solutions, No Constraints):
`// Had to change the background on this one for dark-mode legibility`

### Examples
#### Map Coloring
| Same Sized Rects - Trivial Case | Different Sized Rects - First Solution |
| -------- | -------- |
|
|
|
#### n-Queens
| (n=8, solutions=92, time=101ms) | (n=14, solutions=365596, time=49.435s) |
| -------- | -------- |
|
|
|
## Quad Trees and k-d Trees
Speed Limit Sign (quad tree, distance threshold = 40 / 255.0)
Sunday in the Park With George (k-d tree, uniform random seeds = 1000)

## K-Means Clustering
### Euclidean Distance Relations
K-Means Shown as Points (k=20, seedPoints=40)

**Seed Points**: Sample Data the model is clustered on.
```kotlin=1
val kMeans : KMeans
// Initialize the kMeans with seed points
val closestCluster : Cluster = kMeans[samplePoint]
```
For initializing a kMeans clustering, use `KMeans.getBestKMeans()`.
This will minimize the kMeans variance over the number of iterations.
```kotlin=4
// kMeansVariance: high = poor clustering
val kMeansVariance = kMeans.getMeanVariance()
// each cluster has variance as well
val clusterVariance = closestCluster.getVariance()
```
### Image Compression
For image compression, your number of means is the number of unique colors you want. As k increases, from 8 to 16, 32, 64, etc, the quality of the image increases. Similarly, the number of seed points needed from the original image may increase when k is small to preserve information content.
Video:
[](https://www.youtube.com/watch?v=4C_0dY91V1U)
## Proof of Concept Work
### Neural Networks
#### Feed Forward Neural Network
(xor problem, training set=4, data={(0,0),(0)}, {(0,1),(1)}, {(1,0),(1)}, {(1,1),(0)})

#### Recurrant Neural Networks / Convolutional Neural Networks (WIP)
More Tensor iteration work is needed before these are *fully fleshed out.
### Support Vector Machines (SVM's)

### Decision Trees
Example of decision trees on the Tennis Dataset but generalized for other Attribute types.
#### Tennis Dataset
Attributes = (Outlook, Temperature, Humidity, Wind, PlayTennis)
Outlook = (Sunny, Overcast, Rain)
Temperature = (Hot, Mild, Cool)
Humidity = (High, Normal)
Wind = (Weak, Strong)
PlayTennis = (No, Yes)
```kotlin
val schema = Schema(outlook, temperature, humidity, wind, tennis)
val dataset = Dataset(schema, tennis) // schema, target
dataset.addEntry(Entry(schema, sunny, hot, high, weak, no))
// ...
// Add other Entries
// ...
val decisionTree = dataset.buildTree()
val result = decisionTree.traverse("Sunny", "Mild", "High", "Strong")
// The typeof(result) is the same as the TargetAttribute.type
```
In general, you can have any number of `Attribute`'s each with different types in the same `Schema`.
### Genetic Algorithms (WIP)
Mainly used for String breeding algorithms and minimization functions.
> Note: Future work will include multi-threaded breeders and speciation in a custom bioinformatics library.