https://github.com/tintinweb/hallucinate.sol
😵💫 A Recurrent Neural Network (RNN) hallucinating solidity source code.
solidity tensorflow text-prediction
- Host: GitHub
- URL: https://github.com/tintinweb/hallucinate.sol
- Owner: tintinweb
- Created: 2021-11-12T09:24:52.000Z (about 3 years ago)
- Default Branch: master
- Last Pushed: 2021-11-12T12:35:44.000Z (about 3 years ago)
- Last Synced: 2024-10-03T12:23:51.107Z (4 months ago)
- Topics: solidity, tensorflow, text-prediction
- Language: Jupyter Notebook
- Homepage:
- Size: 42.4 MB
- Stars: 42
- Watchers: 3
- Forks: 7
- Open Issues: 0
Metadata Files:
- Readme: README.md
[[ 🌐 ](https://diligence.consensys.net) [ 📩 ](https://github.com/ConsenSys/vscode-solidity-doppelganger/blob/master/mailto:[email protected]) [ 🔥 ](https://consensys.github.io/diligence/)]

This is a PoC for HackWek! - a Diligence internal 5-day Hackathon 🥷⚔️.
TL;DR: My plan was to have fun with tensorflow, RNNs, and text-prediction, and to connect this to solidity smart contracts 🙌. This is an excerpt of my journey.

# Hallucinate.sol
😵💫 A Recurrent Neural Network (RNN) hallucinating solidity source code.
* We train our model on samples from https://github.com/tintinweb/smart-contract-sanctuary
* And then "hallucinate" new contracts

![image](https://user-images.githubusercontent.com/2865694/141459177-49d9d800-6da5-4736-b7f5-761546532160.png)
**Note**: train the model on https://colab.research.google.com/ as it is much faster than doing this locally.
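
To make the two steps above a bit more concrete, here is a minimal pure-Python sketch of the data preparation behind naive character-based training. It assumes the scheme from the TensorFlow text generation tutorial this project is based on: every character becomes an integer id, and each training example is a sequence paired with the same sequence shifted by one character. The toy corpus, the helper names (`encode`, `decode`, `make_examples`) and the short `SEQ_LENGTH` are made up for illustration and are not taken from the notebooks, which presumably do this with TensorFlow utilities.

```python
# Hypothetical toy corpus; the real project trains on ~15 MB of solidity.
text = "contract A { uint256 x; }\ncontract B { address owner; }\n"

# Vocabulary: every distinct character gets an integer id.
vocab = sorted(set(text))
char_to_id = {ch: i for i, ch in enumerate(vocab)}
id_to_char = {i: ch for ch, i in char_to_id.items()}


def encode(s):
    """Map a string to a list of character ids."""
    return [char_to_id[ch] for ch in s]


def decode(ids):
    """Map a list of character ids back to a string."""
    return "".join(id_to_char[i] for i in ids)


def make_examples(ids, seq_length):
    """Cut ids into non-overlapping chunks of seq_length + 1 characters and
    turn each chunk into an (input, target) pair shifted by one position.
    A trailing partial chunk is dropped."""
    examples = []
    for start in range(0, len(ids) - seq_length, seq_length + 1):
        chunk = ids[start:start + seq_length + 1]
        examples.append((chunk[:-1], chunk[1:]))
    return examples


# The pre-trained model uses 250; shortened here to fit the toy corpus.
SEQ_LENGTH = 10
examples = make_examples(encode(text), SEQ_LENGTH)

first_input, first_target = examples[0]
print(repr(decode(first_input)), "->", repr(decode(first_target)))
```

The model only ever learns "given these characters, which character comes next", which is why the output looks locally plausible but has no guarantee of being valid solidity.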
## Interactive Playground
Copy the python notebook to your own Colab/Google Drive and run it.
Hint: Google Colab → Runtime → Change Runtime Type: GPU
* 👉 [Tutorial 2 - load & hallucinate](https://drive.google.com/file/d/16vQX3SVxmqmkXfwWut38YPfcrRq4SlAE/view?usp=sharing)
* 👉 [Tutorial 1 - train & hallucinate](https://drive.google.com/file/d/13Z6Ak7UCUf6sMvCujeym2A6Bio8mTv66/view?usp=sharing)

## Contents

| Folder | Description |
| ------------ | ------------- |
| [solidity_model_text](./solidity_model_text/) | contains a pre-trained model trained on 15 MB of solidity input using naive character-based training with a sampling sequence length of 250 chars. The model has an `embedding_dimension` of `256` with `1024` `rnn_units`. It was trained for `15 epochs` on Google Colab (hardware acceleration: `GPU`), which took roughly 1-1.5 hrs. |
| [Tutorial 2: load & hallucinate](./tutorial_2_hallucinate_from_pretrained_model.ipynb) | loads the pre-trained model from [./solidity_model_text/](./solidity_model_text/) and hallucinates more solidity. |
| [Tutorial 1: train & hallucinate](./tutorial_1_train_and_hallucinate_save_restore_continue_training.ipynb) | is the code that downloads samples from https://github.com/tintinweb/smart-contract-sanctuary, creates the model, trains it, hallucinates some text, and then continues to show how to save/restore/re-train the model. |

* **Note**: The model can be exported for use with [tensorflow.js](https://www.tensorflow.org/js) so that it can be used with any javascript/web-front/backend. See [Tutorial 1](./tutorial_1_train_and_hallucinate_save_restore_continue_training.ipynb) for how to do this.
* **Note**: The model can also be used for non-solidity code. Just make sure to write your own `SolidityTrainer` class 🙌.

## Improvements
Of course, there's no way to explore everything in a 5-day HackWek, but here are a couple of thoughts on what to improve:
* vocabulary should be based on tokentype_text instead of chars. E.g. use `pygments` to lex `solidity` and use the resulting tokens as the vocabulary. This should give much higher quality output and allow the model to learn the source structure more efficiently.
* input cleanup should reliably remove all comments/pragmas/etc.
* loss function should reinforce training towards fuzzy-parseable code
* shuffle the list of contract sources before downloading
* continuous learning. re-train with more sources (not only 15 MB 😂)
* the pre-trained `solidity_model_text` is pretty shitty and will generate a lot of garbage. Obviously, 15 epochs is not enough and the text based shuffling approach makes no sense. But at least it is generating something 😂

## Example
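
As a rough sketch of the first two improvement ideas (a token-based vocabulary and input cleanup), the snippet below lexes a small solidity sample, drops comments, pragmas and whitespace, and builds a vocabulary keyed on token type plus token text. It deliberately uses a small stdlib regex as a stand-in for `pygments`, so it is not a complete solidity lexer (literals like `1e18` or unicode strings are not handled). `TOKEN_RE`, `tokenize` and `build_vocab` are hypothetical names, not part of this repo.

```python
import re

# Ordered alternatives: string literals come first so that "//" inside a
# string is not mistaken for a comment. This is a simplification, not a
# full solidity grammar.
TOKEN_RE = re.compile(r'''
    (?P<string>"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')
  | (?P<comment>//[^\n]*|/\*.*?\*/)
  | (?P<pragma>pragma\b[^;]*;)
  | (?P<number>0[xX][0-9a-fA-F]+|\d+)
  | (?P<name>[A-Za-z_$][A-Za-z0-9_$]*)
  | (?P<space>\s+)
  | (?P<punct>.)
''', re.VERBOSE | re.DOTALL)

# Token kinds removed during cleanup.
DROPPED = ("comment", "pragma", "space")


def tokenize(source):
    """Yield (kind, text) pairs, skipping comments, pragmas and whitespace."""
    for match in TOKEN_RE.finditer(source):
        kind = match.lastgroup
        if kind in DROPPED:
            continue
        yield kind, match.group()


def build_vocab(tokens):
    """Assign an integer id to every distinct (kind, text) pair."""
    vocab = {}
    for token in tokens:
        vocab.setdefault(token, len(vocab))
    return vocab


sample = '''pragma solidity ^0.4.24;
// owner bookkeeping
contract Ownable {
    address public owner; /* set once */
    string note = "not // a comment";
}
'''

tokens = list(tokenize(sample))
vocab = build_vocab(tokens)
ids = [vocab[token] for token in tokens]

print([text for _, text in tokens])
print(ids)
```

With tokens instead of characters, a 250-step sequence covers far more source code, and the model no longer has to learn how to spell `contract` or `require` one letter at a time.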
Copy the two tutorials to your Google Drive and run them.
**Input:**
```python
>>> print(trainingData.predict(['contract '], 3000))
```

**Output:**
```solidity
contract Ownable {
address public owner;
event OwnershipTransferred(address indexed previousOwner, address indexed newOwner);
function Ownable() public {
owner = msg.sender;
}
modifier onlyOwner() {
require(msg.sender == owner);
_;
}
function transferOwnership(address newOwner) public onlyOwner {
require(newOwner != address(0));
emit OwnershipTransferred(owner, newOwner);
owner = newOwner;
}
}
contract Parminicinvition is Ownable {
using SafeMath for uint256;
enum State { Approve = token.totalSupply();
require(tokens >= summaryTokens.add(bonus));
totalDailydested = totalEthInWei + msg.value;
totalSoldTokens = token.totalSupply();
emit Transfer(address(0), 0xCf49B9298aC4d4933a7D6984d89A49aDc84A6CA602BA513D872C3,21f36325D28718](0));
totalSupply = totalSupply.mul(totalValue.add(soldSignedMap[tokensBough.mul(1)));
restributedPluyRates[msg.sender] = true;
nonStokenSupplyFinallow
}
if(opits[msg.sender].amount <= totalSupply)) ether;
}
assignOpe(address(this).balance, weiAmount);
require(canTra_secrecover(_approved) >= rNo(_reward, _weight, _amount);
totalAmount = totalAmount.add(_amount);
Transfer(_addr, msg.sender, amount);
}
...
```

## Credits
Based on the [TensorFlow Text Generation Tutorial](https://www.tensorflow.org/text/tutorials/text_generation)