https://github.com/suryadevarapranav/stegano-llm
Project Work for the course COMP 642, Spring'24 (Rice University).
https://github.com/suryadevarapranav/stegano-llm
llms machine-learning steganography
Last synced: 4 months ago
JSON representation
Project Work for the course COMP 642, Spring'24 (Rice University).
- Host: GitHub
- URL: https://github.com/suryadevarapranav/stegano-llm
- Owner: suryadevarapranav
- Created: 2024-04-29T21:59:30.000Z (about 1 year ago)
- Default Branch: master
- Last Pushed: 2024-08-09T15:12:05.000Z (11 months ago)
- Last Synced: 2024-08-09T16:56:07.582Z (11 months ago)
- Topics: llms, machine-learning, steganography
- Language: Jupyter Notebook
- Homepage: https://github.com/suryadevarapranav/Stegano-LLM/blob/master/ps102_dp78_Final_Report.pdf
- Size: 14.3 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Stegano-LLM: AI-Powered Steganography
## COMP 642 - Machine Learning, Spring 24
#### Course Instructor: Dr. Anshumali Shrivastava**Team Members**:
[Divit Patidar](https://github.com/divitcr7) - [email protected]
Pranav Suryadevara - [email protected]## Introduction
Stegano-LLM is an innovative project that leverages the power of large language models (LLMs) to revolutionize the field of steganography. By seamlessly integrating cutting-edge artificial intelligence and natural language processing techniques, this project introduces a novel approach to concealing sensitive data within AI-generated textual prompts.
## Motivation
In today's data-driven world, ensuring the security and privacy of information is of paramount importance. Traditional steganographic methods, while effective, often rely on predefined patterns or algorithms, making them vulnerable to detection by sophisticated analysis techniques. Additionally, with the rapid advancements in AI and language models, there is a growing need for steganographic techniques that can adapt and evolve alongside these technologies.
Stegano-LLM addresses these challenges by harnessing the power of state-of-the-art language models, such as Dolly-v-3B from Databricks, to generate highly realistic and contextual textual prompts. By ingeniously embedding sensitive data within these AI-generated prompts, Stegano-LLM offers a robust and adaptable solution for secure data transmission, digital watermarking, covert communication, and tamper detection.
## Approach
The Stegano-LLM project follows a multi-step process that involves various techniques for image data handling, prompt generation, and steganographic embedding:
1. **Image Encoding**: The project begins by converting image files into base64-encoded strings, effectively transforming the binary image data into a textual representation.
2. **Prompt Generation**: A large language model, such as Dolly-v-3B, is employed to generate relevant and naturalistic descriptive prompts for each image irrelevant to it to maintain security. These prompts serve as creative captions, without any prior knowledge of the actual image contents.
3. **Steganographic Embedding**: The base64-encoded image data is seamlessly embedded within the generated prompts using sophisticated steganographic techniques. This embedding process conceals the presence of any hidden data from casual observers.
4. **Data Extraction**: The project includes functionality to securely extract and reconstruct the original base64 data from the embedded prompts, allowing for the retrieval and decoding of the concealed image information.
## Applications
The Stegano-LLM project has numerous potential applications, including but not limited to:
- **Secure Data Transmission**: Sensitive information can be securely transmitted by embedding it within AI-generated textual prompts, ensuring confidentiality and resistance to unauthorized access.
- **Digital Watermarking**: The project can be utilized for digital watermarking purposes, embedding copyright or ownership information within AI-generated content.
- **Covert Communication**: Stegano-LLM enables covert communication channels by concealing sensitive messages within innocuous-looking textual prompts.
- **Tamper Detection**: By embedding digital signatures or checksums within AI-generated content, the project can facilitate tamper detection and ensure the integrity of data.
## Getting Started
To get started with the Stegano-LLM project, follow these steps:
1. Clone the repository: `git clone https://github.com/your-repo/stegano-llm.git`
## Contributing
Contributions to the Stegano-LLM project are welcome! If you have any ideas, bug fixes, or improvements, please open an issue or submit a pull request. Make sure to follow the project's contributing guidelines.
## License
This project is licensed under the [MIT License](LICENSE).
## Acknowledgments
We would like to express our gratitude to the developers of the Dolly-v-3B language model from Databricks, as well as the open-source community for their valuable contributions to the field of artificial intelligence and natural language processing.