https://github.com/kweinmeister/notebooks
Jupyter notebooks for learning and demonstrations
machine-learning notebook python tensorflow
Last synced: about 1 year ago
- Host: GitHub
- URL: https://github.com/kweinmeister/notebooks
- Owner: kweinmeister
- License: apache-2.0
- Created: 2019-02-25T19:48:34.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2025-06-03T20:40:29.000Z (about 1 year ago)
- Last Synced: 2025-06-04T07:47:02.504Z (about 1 year ago)
- Topics: machine-learning, notebook, python, tensorflow
- Language: Jupyter Notebook
- Homepage:
- Size: 510 KB
- Stars: 11
- Watchers: 1
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
README
**This is not an official Google product.**
# Notebooks
- [Notebooks](#notebooks)
- [Google Drive ZIP to GitHub Repository Exporter](#google-drive-zip-to-github-repository-exporter)
- [Identifying LLM "Tells": N-gram Analysis of Human vs. AI Text](#identifying-llm-tells-n-gram-analysis-of-human-vs-ai-text)
- [Querying a GitHub Codebase with Vertex AI RAG Engine](#querying-a-github-codebase-with-vertex-ai-rag-engine)
- [Product Data Enrichment with Vertex AI](#product-data-enrichment-with-vertex-ai)
- [Causal Inference with Vertex AI AutoML Forecasting](#causal-inference-with-vertex-ai-automl-forecasting)
- [Medical Imaging notebooks using Vertex AI](#medical-imaging-notebooks-using-vertex-ai)
- [Understand how your TensorFlow model is making predictions](#understand-how-your-tensorflow-model-is-making-predictions)
- [20 Newsgroups data import script for Google Cloud AutoML Natural Language](#20-newsgroups-data-import-script-for-google-cloud-automl-natural-language)
- [How to use the Google Cloud Natural Language API](#how-to-use-the-google-cloud-natural-language-api)
## Google Drive ZIP to GitHub Repository Exporter
This [notebook](zip-to-repo.ipynb) provides a streamlined workflow to take a ZIP file from your Google Drive and push its contents into a new or existing GitHub repository.
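As a rough local sketch of the same idea (not the notebook's actual code), the snippet below extracts an archive safely and lists the git commands a push to a new repository might run. The helper names `extract_zip` and `git_push_commands`, the commit message, and the `main` branch name are assumptions made for illustration; nothing is executed against GitHub.

```python
import zipfile
from pathlib import Path


def extract_zip(zip_path: str, dest_dir: str) -> list[str]:
    """Extract zip_path into dest_dir and return the names of the extracted files."""
    dest = Path(dest_dir).resolve()
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Refuse entries that would land outside the destination directory.
            target = (dest / name).resolve()
            if not target.is_relative_to(dest):
                raise ValueError(f"unsafe path in archive: {name}")
        archive.extractall(dest)
        return [n for n in archive.namelist() if not n.endswith("/")]


def git_push_commands(remote_url: str, branch: str = "main") -> list[list[str]]:
    """Hypothetical helper: return the git commands for a first push (not executed)."""
    return [
        ["git", "init"],
        ["git", "add", "--all"],
        ["git", "commit", "-m", "Import ZIP contents"],
        ["git", "branch", "-M", branch],
        ["git", "remote", "add", "origin", remote_url],
        ["git", "push", "-u", "origin", branch],
    ]
```

Each command list could be passed to `subprocess.run(cmd, cwd=dest_dir, check=True)` once the remote URL and credentials are in place.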
## Identifying LLM "Tells": N-gram Analysis of Human vs. AI Text
This [notebook](detecting_ai_text_signatures.ipynb) identifies characteristic words and phrases (n-grams) that are statistically more likely to appear in text generated by a Large Language Model than in human-written text. The results help explain the stylistic differences between human and AI-generated content.
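To make the idea concrete, here is a minimal sketch that scores each n-gram by the log-ratio of its smoothed relative frequency in two corpora. It is not taken from the notebook: the tokenizer, the add-one smoothing, and the function names `ngrams` and `log_ratio` are all assumptions.

```python
import math
import re
from collections import Counter


def ngrams(text: str, n: int) -> list[tuple[str, ...]]:
    """Lowercase the text, split it into word tokens, and return its n-grams."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def log_ratio(ai_texts, human_texts, n=2, alpha=1.0):
    """Map each n-gram to log(P_ai / P_human), using add-alpha smoothing."""
    ai = Counter(g for t in ai_texts for g in ngrams(t, n))
    human = Counter(g for t in human_texts for g in ngrams(t, n))
    vocab = set(ai) | set(human)
    ai_total = sum(ai.values()) + alpha * len(vocab)
    human_total = sum(human.values()) + alpha * len(vocab)
    return {
        g: math.log((ai[g] + alpha) / ai_total)
        - math.log((human[g] + alpha) / human_total)
        for g in vocab
    }
```

Positive scores mark n-grams that lean toward the AI corpus and negative scores mark those that lean toward the human corpus. A real analysis would also need a significance test and far more text than a toy example.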
## Querying a GitHub Codebase with Vertex AI RAG Engine
This [notebook](rag_codebase.ipynb) demonstrates how to use Vertex AI's
Retrieval-Augmented Generation (RAG) capabilities to index the code files from a
public GitHub repository and then ask questions about that codebase using a generative model.
It uses [Vertex AI RAG Engine](https://cloud.google.com/vertex-ai/generative-ai/docs/rag-engine/rag-overview),
a component of the Vertex AI Platform.
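The retrieve-then-generate pattern can be illustrated without any cloud services. The sketch below is a toy stand-in, not the Vertex AI RAG Engine API: it ranks files by bag-of-words cosine similarity and assembles a prompt from the best match. The function names, the prompt wording, and the scoring method are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Split text into lowercase identifier-like tokens."""
    return re.findall(r"[a-z_]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question: str, files: dict[str, str], top_k: int = 1) -> list[str]:
    """Return the paths of the top_k files most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        files,
        key=lambda path: cosine(q, Counter(tokenize(files[path]))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, files: dict[str, str], top_k: int = 1) -> str:
    """Assemble a prompt that grounds the question in the retrieved files."""
    context = "\n\n".join(
        f"# {path}\n{files[path]}" for path in retrieve(question, files, top_k)
    )
    return f"Answer using only this code:\n\n{context}\n\nQuestion: {question}"
```

A managed RAG service replaces the similarity function with learned embeddings and a vector index, and sends the assembled prompt to a generative model.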
## Product Data Enrichment with Vertex AI
This [notebook](Product_Data_Enrichment_with_Vertex_AI.ipynb) demonstrates how
to enrich your data using Generative AI with Vertex AI on Google Cloud.
The specific example is a retail use case for improving product description
metadata. Better product descriptions lead to more user engagement and higher
conversion rates.
## Causal Inference with Vertex AI AutoML Forecasting
This [notebook](causal_inference_with_vertex_ai_automl_forecasting.ipynb)
introduces the concept of causal inference. It shows how to estimate the effect
of an intervention using both the
[tfcausalimpact](https://github.com/WillianFuks/tfcausalimpact) library and
[Vertex AI AutoML
Forecasting](https://cloud.google.com/vertex-ai/docs/training/automl-console#forecasting).
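The core idea of estimating an intervention's effect is to predict what would have happened without it (the counterfactual) and compare that with what was observed. The sketch below swaps in ordinary least squares against a single control series for the Bayesian structural time series model that tfcausalimpact fits, so it is a simplification for illustration only. The function names and the single-control setup are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_effect(control, target, intervention_index):
    """Average gap between the observed target and its predicted counterfactual."""
    # Learn the control-to-target relationship from the pre-intervention period.
    slope, intercept = fit_line(
        control[:intervention_index], target[:intervention_index]
    )
    # Project that relationship forward to get the counterfactual.
    counterfactual = [slope * x + intercept for x in control[intervention_index:]]
    observed = target[intervention_index:]
    gaps = [obs - pred for obs, pred in zip(observed, counterfactual)]
    return sum(gaps) / len(gaps)
```

This point estimate carries no uncertainty interval, which is one of the main things a full causal impact model adds.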
## Medical Imaging notebooks using Vertex AI
Run the [pipeline notebook](medical_imaging_pipeline.ipynb) first. It
pre-processes the DICOM medical images in the dataset (download the dataset
before running the notebook), then creates an AutoML model and deploys it to an
endpoint. It demonstrates how to build a pipeline using standard and custom
components.
The [custom training notebook](medical_imaging_custom_training.ipynb) can be run
afterward. It shows how to train a TensorFlow model using the same managed
dataset.
## Understand how your TensorFlow model is making predictions
This [notebook](tensorflow-shap-college-debt.ipynb) demonstrates how to build a
model using [tf.keras](https://www.tensorflow.org/api_docs/python/tf/keras) and
then analyze its feature importances using the
[SHAP](https://github.com/slundberg/shap) library.
The model predicts the expected debt-to-earnings ratio of a university's
graduates. It uses data from the US Department of Education's [College
Scorecard](https://collegescorecard.ed.gov/data/).
More details about the model can be found in the [blog
post](https://medium.com/@kweinmeister/understand-how-your-tensorflow-model-is-making-predictions-d0b3c7e88500).
You can run the model [live in Colab with zero setup
here](https://colab.research.google.com/github/kweinmeister/notebooks/blob/master/tensorflow-shap-college-debt.ipynb).
To run it locally, make sure you have Jupyter installed (`pip install jupyter`).
The model code is included as a Jupyter notebook
(`tensorflow-shap-college-debt.ipynb`). From the root directory, run
`jupyter notebook` to start the notebook server. Then navigate to
`localhost:8888` and click on `tensorflow-shap-college-debt.ipynb`.
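For intuition about what SHAP reports, the sketch below computes exact Shapley values by brute-force enumeration over feature subsets. This is not the SHAP library's algorithm (the library uses faster approximations, since enumeration is exponential in the number of features), and the function name `shapley_values` and the baseline convention are assumptions.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for predict at x; absent features take baseline values."""
    n = len(x)

    def value(subset):
        # Evaluate the model with only the features in `subset` taken from x.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    result = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of each coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        result.append(total)
    return result
```

The values always sum to the difference between the prediction at `x` and the prediction at the baseline, which is what makes them useful as per-feature attributions.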
## 20 Newsgroups data import script for Google Cloud AutoML Natural Language
This [notebook](20_newsgroups_automl.ipynb) downloads the [20 newsgroups
dataset](https://scikit-learn.org/0.19/datasets/twenty_newsgroups.html) using
scikit-learn. This dataset contains about 18,000 posts from 20 newsgroups and is
useful for text classification. The script transforms the data into a pandas
dataframe and finally into a CSV file readable by [Google Cloud AutoML Natural
Language](https://cloud.google.com/natural-language/automl).
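The export step can be sketched with the standard library alone. The example below uses `csv` in place of pandas and assumes a two-column `text,label` layout with no header row; check the AutoML documentation for the exact format your import expects. The function name `to_automl_csv` is hypothetical.

```python
import csv
import io


def to_automl_csv(texts, labels):
    """Return CSV text with one quoted `text,label` row per post and no header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for text, label in zip(texts, labels):
        # Collapse newlines and repeated whitespace so each post stays on one line.
        writer.writerow([" ".join(text.split()), label])
    return buffer.getvalue()
```

With scikit-learn, the inputs could come from `fetch_20newsgroups`, whose result exposes the post bodies as `.data` and maps the integer `.target` values to names through `.target_names`.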
## How to use the Google Cloud Natural Language API
This [notebook](google_cloud_natural_language_api.ipynb) demonstrates how to
perform natural language tasks such as entity extraction, text classification,
sentiment analysis, and syntax analysis using the [Google Cloud Natural Language
API](https://cloud.google.com/natural-language/docs).
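As a hedged sketch of what these calls look like at the REST level, the helper below builds the URL and JSON body for each task without sending anything. The `build_request` helper and its task keys are hypothetical, and the request shape reflects the v1 REST API as best understood here, so confirm it against the API reference before use. Authentication is omitted.

```python
import json

BASE_URL = "https://language.googleapis.com/v1/documents"

# Map a short task name (hypothetical) to the REST method that performs it.
METHODS = {
    "entities": "analyzeEntities",
    "sentiment": "analyzeSentiment",
    "syntax": "analyzeSyntax",
    "classification": "classifyText",
}


def build_request(task: str, text: str) -> tuple[str, str]:
    """Return the (url, json_body) pair for a plain-text analysis request."""
    body = {"document": {"type": "PLAIN_TEXT", "content": text}}
    if task != "classification":
        # The analyze* methods accept an encoding used for text offsets.
        body["encodingType"] = "UTF8"
    return f"{BASE_URL}:{METHODS[task]}", json.dumps(body)
```

The returned pair can be sent as an authenticated POST request. The official Python client library wraps the same methods behind a `LanguageServiceClient`.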